NOVEL PLANT ACYLTRANSFERASES

INTRODUCTION 

This application claims the benefit of U.S. Provisional Application Serial No. 
60/101,939 filed September 25, 1998. 

Technical Field 

The present invention is directed to nucleic acid and amino acid sequences and
constructs, and methods related thereto.

Background

Through the development of plant genetic engineering techniques, it is now possible to 
produce transgenic varieties of plant species to provide plants which have novel and desirable 
characteristics. For example, it is now possible to genetically engineer plants for tolerance to 
environmental stresses, such as resistance to pathogens and tolerance to herbicides and to 
improve the quality characteristics of the plant, for example improved fatty acid compositions.
However, the number of useful nucleotide sequences for the engineering of such
characteristics is thus far limited and the speed with which new useful nucleotide sequences
for engineering new characteristics are identified is slow.

The characterization of various acyltransferase proteins is useful for the further study
of plant fatty acid synthesis systems and for the development of novel and/or alternative oil
sources. Studies of plant mechanisms may provide means to further enhance, control,
modify, or otherwise alter the total fatty acyl composition of triglycerides and oils.
Furthermore, the elucidation of the factor(s) critical to the natural production of fatty acids in
plants is desired, including the purification of such factors and the characterization of
element(s) and/or cofactors which enhance the efficiency of the system. Of particular interest
are the nucleic acid sequences of genes encoding proteins which may be useful for
applications in genetic engineering.
SUMMARY OF THE INVENTION

The present invention provides nucleic acid sequences encoding amino acid
sequences for a class of proteins which are related to acyltransferase proteins. Such proteins
are referred to herein as acyltransferase related or acyltransferase like proteins.

By this invention, nucleic acid sequences encoding these acyltransferase related
proteins may now be characterized with respect to enzyme activity. In particular,
identification and isolation of nucleic acid sequences encoding acyltransferase related
proteins from Arabidopsis, yeast, corn, and soybean are provided.

Thus, this invention encompasses acyltransferase related nucleic acid sequences and
the corresponding amino acid sequences, and the use of these nucleic acid sequences in the
preparation of oligonucleotides containing such acyltransferase related encoding sequences
for analysis and recovery of plant acyltransferase related gene sequences. The acyltransferase
related encoding sequence may encode a complete or partial sequence depending upon the
intended use. All or a portion of the genomic sequence, or cDNA sequence, is intended.

Of special interest are recombinant DNA constructs which provide for transcription or
transcription and translation (expression) of the acyltransferase related sequences in host
cells. In particular, constructs which are capable of transcription or transcription and
translation in plant host cells are preferred. For some applications a reduction in expression
of the acyltransferase related sequences may be desired. Thus, recombinant constructs
may be designed having the acyltransferase related sequences in a reverse orientation for
expression of an anti-sense sequence, or co-suppression (also known as "transwitch")
constructs may be useful. Such constructs may contain a variety of regulatory regions
including transcriptional initiation regions obtained from genes preferentially expressed in
plant seed tissue. For some uses, it may be desired to use the transcriptional and translational
initiation regions of the acyltransferase related gene either with the acyltransferase related
encoding sequence or to direct the transcription and translation of a heterologous sequence.

Also considered in this invention are the plants and seeds containing the constructs
and polynucleotides of this invention.
BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 provides the 204 amino acid conserved sequence profile identified from
comparisons of glycerol-3-phosphate acyltransferase and various lysophosphatidic acid
acyltransferases using PSI-BLAST.

Figure 2 provides an amino acid sequence alignment for the acyltransferase 
sequences. The alignment shown is of the regions of the protein extending from about 30 
amino acids prior to the conserved H in the conserved sequence HXXXXD to 100 amino 
acids after, or downstream, of the P in the conserved PEG sequence motif of the 
acyltransferase-like sequences. 
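
By way of illustration only, the region described for Figure 2 could be located in a protein sequence as sketched below. This Python sketch is not part of the disclosure. It assumes a one-letter amino acid string, treats each X in HXXXXD as any residue, uses the first HXXXXD match and the first PEG found downstream of it, and applies fixed offsets of 30 and 100 residues where the text says "about 30" and "100". The function name and parameters are hypothetical.

```python
import re

def extract_aligned_region(protein, upstream=30, downstream=100):
    """Return the window running from `upstream` residues before the H of
    HXXXXD to `downstream` residues after the P of the first PEG that
    follows it, or None if either motif is absent (illustrative only)."""
    h_match = re.search(r"H[A-Z]{4}D", protein)
    if h_match is None:
        return None
    # Assumption: the PEG motif lies downstream of the HXXXXD motif.
    peg_pos = protein.find("PEG", h_match.end())
    if peg_pos == -1:
        return None
    start = max(0, h_match.start() - upstream)
    end = min(len(protein), peg_pos + 1 + downstream)
    return protein[start:end]

# Toy sequence, not a real acyltransferase.
toy = "A" * 40 + "HAAAAD" + "G" * 20 + "PEG" + "K" * 120
region = extract_aligned_region(toy)
print(len(region))
```

The window is clipped at the sequence ends, so proteins with fewer than 30 residues before the conserved H still yield a (shorter) region.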

Figure 3 provides schematics showing the relationship of the identified 
acyltransferases. The relationships described are derived from an alignment of the regions of 
the protein extending from about 30 amino acids prior to the conserved H in the conserved 
sequence HXXXXD to 100 amino acids after, or downstream, of the P in the conserved PEG 
sequence motif of the acyltransferase-like sequences. Figure 3A provides a phylogenetic tree
showing the relationship of several acyltransferases. Figure 3B provides a table showing the 
percent similarities and percent divergence of the novel acyltransferases and known 
acyltransferases using the Clustal method with PAM250 residue weight table. 



DETAILED DESCRIPTION OF THE INVENTION 

In accordance with the subject invention, nucleotide sequences are provided which are
capable of coding sequences of amino acids, such as a protein, polypeptide or peptide, which
are related to acyltransferase proteins. Such sequences are referred to herein as
acyltransferase-like or acyltransferase related. The novel nucleic acid sequences find use in
the preparation of constructs to direct their expression in a host cell. Furthermore, the novel 
nucleic acid sequences may find use in the preparation of plant expression constructs to 
modify the fatty acid composition of a plant cell. 

In one embodiment of the present invention, nucleic acid sequences, also referred to
herein as polynucleotides, which are related to acyltransferases are identified from databases.
Isolated Proteins, Polypeptides and Polynucleotides

A first aspect of the present invention relates to isolated acyltransferase 
polynucleotides. The polynucleotide sequences of the present invention include isolated 
polynucleotides that encode the polypeptides of the invention having a deduced amino acid
sequence selected from the group of sequences set forth in the Sequence Listing and to other 
polynucleotide sequences closely related to such sequences and variants thereof. 

The invention provides a polynucleotide sequence identical over its entire length to 
each coding sequence as set forth in the Sequence Listing. The invention also provides the 
coding sequence for the mature polypeptide or a fragment thereof, as well as the coding 
sequence for the mature polypeptide or a fragment thereof in a reading frame with other 
coding sequences, such as those encoding a leader or secretory sequence, a pre-, pro-, or 
prepro- protein sequence. The polynucleotide can also include non-coding sequences, 
including for example, but not limited to, non-coding 5' and 3' sequences, such as the 
transcribed, untranslated sequences, termination signals, ribosome binding sites, sequences
that stabilize mRNA, introns, polyadenylation signals, and additional coding sequence that
encodes additional amino acids. For example, a marker sequence can be included to facilitate
the purification of the fused polypeptide. Polynucleotides of the present invention also
include polynucleotides comprising a structural gene and the naturally associated sequences
that control gene expression.

The invention also includes polynucleotides of the formula: 

X-(R1)n-(R2)-(R3)n-Y

wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R1 and R3
are any nucleic acid residue, n is an integer between 1 and 3000, preferably between 1 and
1000 and R2 is a nucleic acid sequence of the invention, particularly a nucleic acid sequence
selected from the group set forth in the Sequence Listing and preferably SEQ ID NOs: 1, 3, 5,
7, 9, 10, 12, 14, 16, 18, 20, 22, and 226-233. In the formula, R2 is oriented so that its 5' end
residue is at the left, bound to R1, and its 3' end residue is at the right, bound to R3. Any
stretch of nucleic acid residues denoted by either R group, where n is greater than 1, may be
either a heteropolymer or a homopolymer, preferably a heteropolymer.

The invention also relates to variants of the polynucleotides described herein that
encode for variants of the polypeptides of the invention. Variants that are fragments of the
polynucleotides of the invention can be used to synthesize full-length polynucleotides of the
invention. Preferred embodiments are polynucleotides encoding polypeptide variants wherein
5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues of a polypeptide sequence of the
invention are substituted, added or deleted, in any combination. Particularly preferred are
substitutions, additions, and deletions that are silent such that they do not alter the properties
or activities of the polynucleotide or polypeptide.

WO 00/18889 PCT/US99/22231

Nucleotide sequences encoding acyltransferases may be obtained from natural sources 
or be partially or wholly artificially synthesized. They may directly correspond to an 
acyltransferase endogenous to a natural source or contain modified amino acid sequences, 
such as sequences which have been mutated, truncated, increased or the like. Acyltransferases 
may be obtained by a variety of methods, including but not limited to, partial or homogenous 
purification of protein extracts, protein modeling, nucleic acid probes, antibody preparations 
and sequence comparisons. Typically an acyltransferase will be derived in whole or in part 
from a natural source. A natural source includes, but is not limited to, prokaryotic and
eukaryotic sources, including bacteria, yeasts, plants (including algae), and the like.

Of special interest are acyltransferases which are obtainable from eukaryotic sources,
including those which are obtained from plants, or acyltransferases which are
obtainable through the use of these sequences. "Obtainable" refers to those acyltransferases
which have sufficiently similar sequences to that of the sequences provided herein to provide
a biologically active protein of the present invention.

Further preferred embodiments of the invention are polynucleotides that are at least 50%,
60%, or 70% identical over their entire length to a polynucleotide encoding a polypeptide of
the invention, and polynucleotides that are complementary to such polynucleotides. More preferable are
polynucleotides that comprise a region that is at least 80% identical over its entire length to a 
polynucleotide encoding a polypeptide of the invention and polynucleotides that are 
complementary thereto. In this regard, polynucleotides at least 90% identical over their entire 
length are particularly preferred, those at least 95% identical are especially preferred. Further, 
those with at least 97% identity are highly preferred and those with at least 98% and 99% 
identity are particularly highly preferred, with those at least 99% being the most highly 
preferred. 

Preferred embodiments are polynucleotides that encode polypeptides that retain 
substantially the same biological function or activity as the mature polypeptides encoded by 
the polynucleotides set forth in the Sequence Listing. 
The invention further relates to polynucleotides that hybridize to the above-described
sequences. In particular, the invention relates to polynucleotides that hybridize under
stringent conditions to the above-described polynucleotides. As used herein, the terms
"stringent conditions" and "stringent hybridization conditions" mean that hybridization will
generally occur if there is at least 95% and preferably at least 97% identity between the
sequences. An example of stringent hybridization conditions is overnight incubation at 42°C
in a solution comprising 50% formamide, 5x SSC (1x SSC = 150 mM NaCl, 15 mM trisodium
citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20
micrograms/milliliter denatured, sheared salmon sperm DNA, followed by washing the
hybridization support in 0.1x SSC at approximately 65°C. Other hybridization and wash
conditions are well known and are exemplified in Sambrook, et al., Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor, NY (1989), particularly Chapter 11.

The invention also provides a polynucleotide consisting essentially of a
polynucleotide sequence obtainable by screening an appropriate library containing the
complete gene for a polynucleotide sequence set forth in the Sequence Listing under stringent
hybridization conditions with a probe having the sequence of said polynucleotide sequence or
a fragment thereof; and isolating said polynucleotide sequence. Fragments useful for
obtaining such a polynucleotide include, for example, probes and primers as described herein.
As discussed herein regarding polynucleotide assays of the invention, for example,
polynucleotides of the invention can be used as a hybridization probe for RNA, cDNA, or
genomic DNA to isolate full length cDNAs or genomic clones encoding a polypeptide and to
isolate cDNA or genomic clones of other genes that have a high sequence similarity to a
polynucleotide set forth in the Sequence Listing. Such probes will generally comprise at least
15 bases. Preferably such probes will have at least 30 bases and can have at least 50 bases.
Particularly preferred probes will have between 30 bases and 50 bases, inclusive.

The coding region of each gene that comprises or is comprised by a polynucleotide
sequence set forth in the Sequence Listing may be isolated by screening using a DNA
sequence provided in the Sequence Listing to synthesize an oligonucleotide probe. A labeled
oligonucleotide having a sequence complementary to that of a gene of the invention is then
used to screen a library of cDNA, genomic DNA or mRNA to identify members of the library
which hybridize to the probe. For example, synthetic oligonucleotides are prepared which
correspond to the N-terminal sequence of the polypeptide. The partial sequences so prepared
can then be used as probes to obtain acyltransferase clones from a gene library prepared from
a cell source of interest. Alternatively, where oligonucleotides of low degeneracy can be
prepared from particular peptides, such probes may be used directly to screen gene libraries
for gene sequences. In particular, screening of cDNA libraries in phage vectors is useful in
such methods due to lower levels of background hybridization.
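
As a non-authoritative illustration of what "low degeneracy" means for an oligonucleotide designed from a peptide, the following Python sketch counts how many distinct nucleotide sequences could encode a peptide under the standard genetic code, and picks the peptide window with the fewest. The helper names are hypothetical; the codon counts per amino acid are those of the standard genetic code, and the example peptide is invented.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

def probe_degeneracy(peptide):
    """Number of distinct coding sequences for `peptide`."""
    degeneracy = 1
    for residue in peptide.upper():
        degeneracy *= CODON_COUNTS[residue]
    return degeneracy

def least_degenerate_window(peptide, width):
    """Return (start index, window) of the `width`-residue stretch whose
    back-translation has the lowest degeneracy (first one on ties)."""
    starts = range(len(peptide) - width + 1)
    best = min(starts, key=lambda i: probe_degeneracy(peptide[i:i + width]))
    return best, peptide[best:best + width]

# Invented peptide, for illustration only.
print(least_degenerate_window("LRSMWKLL", 3))
```

Windows rich in Met and Trp (one codon each) give the least degenerate probes, while Leu, Arg and Ser (six codons each) inflate the count quickly.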
Typically, a sequence obtainable from the use of nucleic acid probes will show 60-70%
sequence identity between the target acyltransferase sequence and the encoding sequence
used as a probe. However, lengthy sequences with as little as 50-60% sequence identity may
also be obtained. The nucleic acid probes may be a lengthy fragment of the nucleic acid
sequence, or may also be a shorter, oligonucleotide probe. When longer nucleic acid
fragments are employed as probes (greater than about 100 bp), one may screen at lower
stringencies in order to obtain sequences from the target sample which have 20-50%
deviation (i.e., 50-80% sequence homology) from the sequences used as probe.
Oligonucleotide probes can be considerably shorter than the entire nucleic acid sequence
encoding an acyltransferase enzyme, but should be at least about 10, preferably at least about
15, and more preferably at least about 20 nucleotides. A higher degree of sequence identity is
desired when shorter regions are used as opposed to longer regions. It may thus be desirable
to identify regions of highly conserved amino acid sequence to design oligonucleotide probes
for detecting and recovering other related genes. Shorter probes are often particularly useful
for polymerase chain reactions (PCR), especially when highly conserved sequences can be
identified. (See, Gould, et al., PNAS USA (1989) 86:1934-1938).

The skilled artisan will appreciate that, in many cases, an isolated cDNA sequence
will be incomplete, in that the region coding for the polypeptide is truncated with respect to
the 5' terminus of the cDNA. This is a consequence of the reverse transcriptase, an enzyme
with low 'processivity' (a measure of the ability of the enzyme to remain attached to the
template during the polymerization reaction) employed during the first strand cDNA
synthesis.

There are several methods available, and well known to the skilled artisan, to obtain
full-length cDNAs, or extend short cDNAs, for example those based on the method of Rapid
Amplification of cDNA Ends (RACE) (see, for example, Frohman et al. (1988) Proc. Natl.
Acad. Sci. USA 85:8998-9002). Recent modifications of the technique, exemplified by the
Marathon™ technology (Clontech Laboratories, Inc.) for example, have significantly
simplified obtaining full-length cDNA sequences.
Another aspect of the present invention relates to isolated acyltransferase
polypeptides. Such polypeptides include isolated polypeptides set forth in the Sequence
Listing, as well as polypeptides and fragments thereof, particularly those polypeptides which
exhibit acyltransferase activity and also those polypeptides which have at least 50%, 60% or
70% identity, preferably at least 80% identity, more preferably at least 90% identity, and most
preferably at least 95% identity to a polypeptide sequence selected from the group of
sequences set forth in the Sequence Listing, and also include portions of such polypeptides,
wherein such portion of the polypeptide preferably includes at least 30 amino acids and more
preferably includes at least 50 amino acids.

"Identity", as is well understood in the art, is a relationship between two or more
polypeptide sequences or two or more polynucleotide sequences, as determined by comparing
the sequences. In the art, "identity" also means the degree of sequence relatedness between
polypeptide or polynucleotide sequences, as determined by the match between strings of such
sequences. "Identity" can be readily calculated by known methods including, but not limited
to, those described in Computational Molecular Biology, Lesk, A.M., ed., Oxford University
Press, New York (1988); Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York (1993); Computer Analysis of Sequence Data, Part I, Griffin,
A.M. and Griffin, H.G., eds., Humana Press, New Jersey (1994); Sequence Analysis in
Molecular Biology, von Heijne, G., Academic Press (1987); Sequence Analysis Primer,
Gribskov, M. and Devereux, J., eds., Stockton Press, New York (1991); and Carillo, H., and
Lipman, D., SIAM J Applied Math, 48:1073 (1988). Methods to determine identity are
designed to give the largest match between the sequences tested. Moreover, methods to
determine identity are codified in publicly available programs. Computer programs which
can be used to determine identity between two sequences include, but are not limited to, GCG
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984)); and the suite of five BLAST
programs, three designed for nucleotide sequence queries (BLASTN, BLASTX, and
TBLASTX) and two designed for protein sequence queries (BLASTP and TBLASTN)
(Coulson, Trends in Biotechnology, 12: 76-80 (1994); Birren, et al., Genome Analysis, 1:
543-559 (1997)). The BLASTX program is publicly available from NCBI and other sources
(BLAST Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD 20894; Altschul, S., et
al., J. Mol. Biol., 215:403-410 (1990)). The well known Smith Waterman algorithm can also
be used to determine identity.
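
For orientation only, the "match between strings" can be reduced to a percentage once two sequences have been aligned. The Python sketch below is an assumption-laden illustration, not the method of any program cited above: it expects two already aligned strings of equal length with "-" marking gaps, and it divides by the number of alignment columns. Other conventions divide by the shorter sequence length or by ungapped columns only, so reported values can differ between tools.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows.

    Assumption: the denominator is the number of columns that contain at
    least one residue; columns that are gaps in both rows are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)

print(percent_identity("ACGT", "ACCT"))
```

A value computed this way could then be compared against the 50%, 80%, 90% or 95% thresholds recited above.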
Parameters for polypeptide sequence comparison typically include the following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)

Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad.
Sci. USA 89:10915-10919 (1992)

Gap Penalty: 12

Gap Length Penalty: 4

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters along
with no penalty for end gap are the default parameters for peptide comparisons.

Parameters for polynucleotide sequence comparison include the following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)

Comparison matrix: matches = +10; mismatches = 0

Gap Penalty: 50

Gap Length Penalty: 3

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters are the
default parameters for nucleic acid comparisons.
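
The polynucleotide parameters above can be exercised with a small global alignment scorer. The following Python sketch is illustrative and is not the GCG "gap" program. It assumes that a gap of length k costs Gap Penalty + k x Gap Length Penalty, and that end gaps are penalized like internal gaps; both are assumptions that may not match the behavior of the cited program. It uses the standard three-state (affine gap) form of Needleman-Wunsch dynamic programming, and the function name is hypothetical.

```python
def global_alignment_score(a, b, match=10, mismatch=0, gap_open=50, gap_extend=3):
    """Best global alignment score of nucleotide strings a and b.

    Assumed gap cost: gap_open + gap_extend * length, end gaps included.
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    # M: a[i-1] aligned with b[j-1]; X: a[i-1] opposite a gap; Y: b[j-1] opposite a gap.
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + gap_extend * i)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + gap_extend * j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            X[i][j] = max(M[i - 1][j] - gap_open - gap_extend,
                          X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open - gap_extend,
                          Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])

# Seven matches (70) minus one single-base gap (50 + 3).
print(global_alignment_score("ACGTACGT", "ACGACGT"))
```

With a match score of +10 and a gap cost above 50, a gap pays off only when it restores at least six matches, which is why these defaults favor long ungapped stretches. Extending the sketch to polypeptides would require a substitution matrix such as BLOSUM62 in place of the match/mismatch scores.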

The invention also includes polypeptides of the formula: 

X-(R1)n-(R2)-(R3)n-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is hydrogen or
a metal, R1 and R3 are any amino acid residue, n is an integer between 1 and 1000, and R2 is
an amino acid sequence of the invention, particularly an amino acid sequence selected from
the group set forth in the Sequence Listing and preferably SEQ ID NOs: 2, 4, 6, 8, 11, 13, 15,
17, 19, 21, 23, and 218-225. In the formula, R2 is oriented so that its amino terminal residue
is at the left, bound to R1, and its carboxy terminal residue is at the right, bound to R3. Any
stretch of amino acid residues denoted by either R group, where n is greater than 1, may be
either a heteropolymer or a homopolymer, preferably a heteropolymer.

Polypeptides of the present invention include isolated polypeptides encoded by a
polynucleotide comprising a sequence selected from the group of a sequence contained in
SEQ ID NOs: 1, 3, 5, 7, 9, 10, 12, 14, 16, 18, 20, 22, and 226-233.

The polypeptides of the present invention can be mature protein or can be part of a 
fusion protein. 
Fragments and variants of the polypeptides are also considered to be a part of the
invention. A fragment is a variant polypeptide which has an amino acid sequence that is
entirely the same as part but not all of the amino acid sequence of the previously described
polypeptides. The fragments can be "free-standing" or comprised within a larger polypeptide
of which the fragment forms a part or a region, most preferably as a single continuous region.
Preferred fragments are biologically active fragments which are those fragments that mediate
activities of the polypeptides of the invention, including those with similar activity or
improved activity or with a decreased activity. Also included are those fragments that are
antigenic or immunogenic in an animal, particularly a human.

Variants of the polypeptide also include polypeptides that vary from the sequences set
forth in the Sequence Listing by conservative amino acid substitutions, substitution of a
residue by another with like characteristics. In general, such substitutions are among Ala,
Val, Leu and Ile; between Ser and Thr; between Asp and Glu; between Asn and Gln; between
Lys and Arg; or between Phe and Tyr. Particularly preferred are variants in which 5 to 10; 1
to 5; 1 to 3 or one amino acid(s) are substituted, deleted, or added, in any combination.
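
The substitution groups listed above lend themselves to a simple lookup. The Python sketch below is offered only as an illustration: it encodes exactly the six groups named in the text using one-letter codes, handles substitutions only (not additions or deletions), and uses hypothetical function names and an invented example sequence.

```python
# The six groups named in the text, in one-letter code:
# Ala/Val/Leu/Ile, Ser/Thr, Asp/Glu, Asn/Gln, Lys/Arg, Phe/Tyr.
CONSERVATIVE_GROUPS = [set("AVLI"), set("ST"), set("DE"),
                       set("NQ"), set("KR"), set("FY")]

def is_conservative(original, replacement):
    """True if the two different residues fall in the same listed group."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # not a substitution at all
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)

def classify_variant(reference, variant):
    """List substitutions (1-based position, old, new) between two
    equal-length sequences and report whether all are conservative."""
    if len(reference) != len(variant):
        raise ValueError("this sketch handles substitutions only")
    subs = [(i + 1, r, v)
            for i, (r, v) in enumerate(zip(reference, variant)) if r != v]
    return {"substitutions": subs,
            "all_conservative": all(is_conservative(r, v) for _, r, v in subs)}

# Invented sequences for illustration.
print(classify_variant("MKLSD", "MRLTD"))
```

The length of the `substitutions` list can be compared with the preferred ranges given above (5 to 10, 1 to 5, 1 to 3, or one).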

Variants that are fragments of the polypeptides of the invention can be used to 
produce the corresponding full length polypeptide by peptide synthesis. Therefore, these 
variants can be used as intermediates for producing the full-length polypeptides of the 
invention. 

The polynucleotides and polypeptides of the invention can be used, for example, in
the transformation of various host cells, as further discussed herein.

The invention also provides polynucleotides that encode a polypeptide that is a mature
protein plus additional amino or carboxyl-terminal amino acids, or amino acids within the
mature polypeptide (for example, when the mature form of the protein has more than one
polypeptide chain). Such sequences can, for example, play a role in the processing of a
protein from a precursor to a mature form, allow protein transport, shorten or lengthen protein
half-life, or facilitate manipulation of the protein in assays or production. It is contemplated
that cellular enzymes can be used to remove any additional amino acids from the mature
protein.

A precursor protein, having the mature form of the polypeptide fused to one or more
prosequences, may be an inactive form of the polypeptide. The inactive precursors generally
are activated when the prosequences are removed. Some or all of the prosequences may be
removed prior to activation. Such precursor proteins are generally called proproteins.
The polynucleotide and polypeptide sequences can also be used to identify additional
sequences which are homologous to the sequences of the present invention. The most
preferable and convenient method is to store the sequence in a computer readable medium,
for example, floppy disk, CD ROM, hard disk drives, external disk drives and DVD, and then
to use the stored sequence to search a sequence database with well known searching tools.
Examples of public databases include the DNA Database of Japan
(DDBJ) (http://www.ddbj.nig.ac.jp/); GenBank; and the European Molecular
Biology Laboratory Nucleic Acid Sequence Database (EMBL). A number of different search
algorithms are available to the skilled artisan, one example of which is the suite of programs
referred to as BLAST programs. There are five implementations of BLAST, three designed for
nucleotide sequence queries (BLASTN, BLASTX, and TBLASTX) and two designed for protein
sequence queries (BLASTP and TBLASTN) (Coulson, Trends in Biotechnology, 12: 76-80
(1994); Birren, et al., Genome Analysis, 1: 543-559 (1997)). Additional programs are
available in the art for the analysis of identified sequences, such as sequence alignment
programs, programs for the identification of more distantly related sequences, and the like,
and are well known to the skilled artisan.
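
The choice among the five BLAST programs follows from the type of the query and of the database searched. As a rough aid, the Python sketch below maps those two inputs to a program name. The function and its arguments are hypothetical and are not part of any BLAST distribution; the mapping itself reflects the commonly documented roles of the programs (BLASTX translates a nucleotide query against a protein database, TBLASTN compares a protein query with a translated nucleotide database, and TBLASTX translates both).

```python
def choose_blast_program(query_type, database_type, translated=False):
    """Pick a BLAST program name from query and database types.

    query_type, database_type: "nucleotide" or "protein".
    translated: for nucleotide-vs-nucleotide searches only, compare the
    six-frame translations of both (TBLASTX) instead of the bases (BLASTN).
    """
    table = {
        ("nucleotide", "nucleotide"): "TBLASTX" if translated else "BLASTN",
        ("nucleotide", "protein"): "BLASTX",
        ("protein", "protein"): "BLASTP",
        ("protein", "nucleotide"): "TBLASTN",
    }
    try:
        return table[(query_type, database_type)]
    except KeyError:
        raise ValueError("types must be 'nucleotide' or 'protein'")

# A protein query against a nucleotide (e.g. EST) database.
print(choose_blast_program("protein", "nucleotide"))
```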
Plant Constructs and Methods of Use



Of interest in the present invention is the use of the nucleotide sequences, or
polynucleotides, in recombinant DNA constructs to direct the transcription or transcription
and translation (expression) of the acyltransferase sequences of the present invention in a host
cell.

Of particular interest is the use of the nucleotide sequences, or polynucleotides, in 
recombinant DNA constructs to direct the transcription or transcription and translation 
(expression) of the acyltransferase sequences of the present invention in a host cell. The 
expression constructs generally comprise a promoter functional in a host cell operably linked 
to a nucleic acid sequence encoding an acyltransferase of the present invention and a 
transcriptional termination region functional in a host cell. 

By "host cell" is meant a cell which contains a vector and supports the replication, 
and/or transcription or transcription and translation (expression) of the expression construct. 
Host cells for use in the present invention can be prokaryotic cells, such as E. coli, or
eukaryotic cells such as yeast, plant, insect, amphibian, or mammalian cells. Preferably, host
cells are monocotyledonous or dicotyledonous plant cells.

Of particular interest in the present invention is the use of the polynucleotides of the
present invention for the preparation of constructs to direct the transcription or transcription
and translation of the nucleotide sequences encoding an acyltransferase in a host plant cell.
Plant expression constructs generally comprise a promoter functional in a plant host cell
operably linked to a nucleic acid sequence of the present invention and a transcriptional
termination region functional in a host plant cell.
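
Purely as a schematic aid, the three elements of a plant expression construct named above can be written down as an ordered record. The Python sketch below is a hypothetical data structure, not a cloning protocol: the class name, field names, the "sense"/"antisense" labels and the placeholder element labels are all assumptions introduced for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExpressionConstruct:
    """Hypothetical record of construct elements in 5' to 3' order."""
    promoter: str               # promoter functional in the plant host cell
    coding_sequence: str        # acyltransferase encoding sequence, or a portion
    terminator: str             # transcriptional termination region
    orientation: str = "sense"  # assumed labels: "sense" or "antisense"

    def __post_init__(self):
        if self.orientation not in ("sense", "antisense"):
            raise ValueError("orientation must be 'sense' or 'antisense'")

    def layout(self):
        """Return the element labels in 5' to 3' order."""
        insert = self.coding_sequence
        if self.orientation == "antisense":
            insert += " (reverse orientation)"
        return [self.promoter, insert, self.terminator]

# Placeholder labels only; no specific sequences are implied.
construct = ExpressionConstruct("seed-preferential promoter",
                                "acyltransferase coding sequence",
                                "transcript termination region")
print(" -> ".join(construct.layout()))
```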
Those skilled in the art will recognize that there are a number of promoters which are
functional in plant cells, and have been described in the literature. Chloroplast and plastid
specific promoters, chloroplast or plastid functional promoters, and chloroplast or plastid
operable promoters are also envisioned.

One set of promoters is constitutive promoters such as the CaMV35S or FMV35S
promoters that yield high levels of expression in most plant organs. Enhanced or duplicated
versions of the CaMV35S and FMV35S promoters are useful in the practice of this invention
(Odell, et al. (1985) Nature 313:810-812; Rogers, U.S. Patent Number 5,378,619). In
addition, it may also be preferred to bring about expression of the protein of interest in
specific tissues of the plant, such as leaf, stem, root, tuber, seed, fruit, etc., and the promoter
chosen should have the desired tissue and developmental specificity.

Of particular interest is the expression of the nucleic acid sequences of the present
invention from transcription initiation regions which are preferentially expressed in a plant
seed tissue. Examples of such seed preferential transcription initiation sequences include
those sequences derived from sequences encoding plant storage protein genes or from genes
involved in fatty acid biosynthesis in oilseeds. Examples of such promoters include the 5'
regulatory regions from such genes as napin (Kridl et al., Seed Sci. Res. 1:209-219 (1991)),
phaseolin, zein, soybean trypsin inhibitor, ACP, stearoyl-ACP desaturase, soybean α' subunit
of β-conglycinin (soy 7s, (Chen et al., Proc. Natl. Acad. Sci., 83:8560-8564 (1986))) and
oleosin.

It may be advantageous to direct the localization of proteins conferring acyltransferase
activity to a particular subcellular compartment, for example, to the mitochondrion, endoplasmic
reticulum, vacuoles, chloroplast or other plastidic compartment. For example, where the
genes of interest of the present invention will be targeted to plastids, such as chloroplasts, for
expression, the constructs will also employ the use of sequences to direct the gene to the
plastid. Such sequences are referred to herein as chloroplast transit peptides (CTP) or plastid
transit peptides (PTP). In this manner, where the gene of interest is not directly inserted into
the plastid, the expression construct will additionally contain a gene encoding a transit
peptide to direct the gene of interest to the plastid. The chloroplast transit peptides may be
derived from the gene of interest, or may be derived from a heterologous sequence having a
CTP. Such transit peptides are known in the art. See, for example, Von Heijne et al. (1991)
Plant Mol. Biol. Rep. 9:104-126; Clark et al. (1989) J. Biol. Chem. 264:17544-17550;
della-Cioppa et al. (1987) Plant Physiol. 84:965-968; Romer et al. (1993) Biochem. Biophys. Res.
Commun. 196:1414-1421; and, Shah et al. (1986) Science 233:478-481. Additional transit
peptides for the translocation of the protein to the endoplasmic reticulum (ER), or vacuole
may also find use in the constructs of the present invention.

Depending upon the intended use, the constructs may contain the nucleic acid 
sequence which encodes the entire acyltransferase protein, or a portion thereof. For example, 
where antisense inhibition of a given acyltransferase protein is desired, the entire sequence is 
not required. Furthermore, where acyltransferase sequences used in constructs are intended 
for use as probes, it may be advantageous to prepare constructs containing only a particular 
portion of an acyltransferase encoding sequence, for example a sequence which is discovered 
to encode a highly conserved acyltransferase region. 

The skilled artisan will recognize that there are various methods for the inhibition of 
expression of endogenous sequences in a host cell. Such methods include, but are not limited 
to, antisense suppression (Smith et al. (1988) Nature 334:724-726), co-suppression (Napoli 
et al. (1989) Plant Cell 2:279-289), ribozymes (PCT Publication WO 97/10328), and 
combinations of sense and antisense, such as those described by Waterhouse et al. (1998) 
Proc. Natl. Acad. Sci. USA 95:13959-13964. Methods for the suppression of endogenous 
sequences in a host cell typically employ the transcription or transcription and translation of 
at least a portion of the sequence to be suppressed. Such sequences may be homologous to 
coding as well as non-coding regions of the endogenous sequence. 

Regulatory transcript termination regions may be provided in plant expression 
constructs of this invention as well. Transcript termination regions may be provided by the 
DNA sequence encoding the acyltransferase or a convenient transcription termination region 
derived from a different gene source, for example, the transcript termination region which is 
naturally associated with the transcript initiation region. The skilled artisan will recognize 
that any convenient transcript termination region which is capable of terminating transcription 
in a plant cell may be employed in the constructs of the present invention. 

Alternatively, constructs may be prepared to direct the expression of the 
acyltransferase sequences directly from the host plant cell plastid. Such constructs and 
methods are known in the art and are generally described, for example, in Svab, et al. (1990) 
Proc. Natl. Acad. Sci. USA 87:8526-8530 and Svab and Maliga (1993) Proc. Natl. Acad. Sci. 
USA 90:913-917 and in U.S. Patent Number 5,693,507. 

A plant cell, tissue, organ, or plant into which the recombinant DNA constructs 
containing the expression constructs have been introduced is considered transformed, 
transfected, or transgenic. A transgenic or transformed cell or plant also includes progeny of 
the cell or plant and progeny produced from a breeding program employing such a transgenic 
plant as a parent in a cross and exhibiting an altered genotype resulting from the presence of 
an introduced acyltransferase nucleic acid sequence. 

The term "introduced" in the context of inserting a nucleic acid sequence into a cell, 
means "transfection", or "transformation" or "transduction" and includes reference to the 
incorporation of a nucleic acid sequence into a eukaryotic or prokaryotic cell where the 
nucleic acid sequence may be incorporated into the genome of the cell (for example, 
chromosome, plasmid, plastid, or mitochondrial DNA), converted into an autonomous 
replicon, or transiently expressed (for example, transfected mRNA). 

Plant expression or transcription constructs having an acyltransferase as the DNA 
sequence of interest for increased or decreased expression thereof may be employed with a 
wide variety of plant life, particularly, plant life involved in the production of vegetable oils 
for edible and industrial uses. Plants of interest in the present invention include 
monocotyledonous and dicotyledonous plants. Most especially preferred are temperate 
oilseed crops. Plants of interest include, but are not limited to, rapeseed (Canola and High 
Erucic Acid varieties), sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, 
and corn. Depending on the method for introducing the recombinant constructs into the host 
cell, other DNA sequences may be required. Importantly, this invention is applicable to 
dicotyledonous and monocotyledonous species alike and will be readily applicable to new and/or 
improved transformation and regulation techniques. 

As used herein, the term "plant" includes reference to whole plants, plant organs (for 
example, leaves, stems, roots, etc.), seeds, and plant cells and progeny of same. Plant cell, as 
used herein includes, without limitation, seeds, suspension cultures, embryos, meristematic 
regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and 
microspores. The class of plants which can be used in the methods of the present invention is 
generally as broad as the class of higher plants amenable to transformation techniques, 
including both monocotyledonous and dicotyledonous plants. Particularly preferred plants of 
interest include, but are not limited to, rapeseed (Canola and High Erucic Acid varieties), 
sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, and corn. Most 
especially preferred plants include Brassica, soybean, and corn. 

As used herein, "transgenic plant" includes reference to a plant which comprises 
within its genome a heterologous polynucleotide. Generally, the heterologous polynucleotide 
is stably integrated within the genome such that the polynucleotide is passed on to successive 
generations. The heterologous polynucleotide may be integrated into the genome alone or as 
part of a recombinant expression cassette. "Transgenic" is used herein to include any cell, 
cell line, callus, tissue, plant part or plant, the genotype of which has been altered by the 
presence of heterologous nucleic acid including those transgenics initially so altered as well 
as those created by sexual crosses or asexual propagation from the initial transgenic. 

Thus a plant having within its cells a heterologous polynucleotide is referred to herein 
as a transgenic plant. The heterologous polynucleotide can be either stably integrated into the 
genome, or can be extra-chromosomal. Preferably, the polynucleotide of the present 
invention is stably integrated into the genome such that the polynucleotide is passed on to 
successive generations. The polynucleotide is integrated into the genome alone or as part of a 
recombinant expression cassette. "Transgenic" is used herein to include any cell, cell line, 
callus, tissue, plant part or plant, the genotype of which has been altered by the presence of 
heterologous nucleic acids including those transgenics initially so altered as well as those 
created by sexual crosses or asexual reproduction of the initial transgenics. 

As used herein, "heterologous" in reference to a nucleic acid is a nucleic acid that 
originates from a foreign species, or, if from the same species, is substantially modified from 
its native form in composition and/or genomic locus by deliberate human intervention. For 
example, a promoter operably linked to a heterologous structural gene is from a species 
different from that from which the structural gene was derived, or, if from the same species, 
one or both are substantially modified from their original form. A heterologous protein may 
originate from a foreign species, or, if from the same species, is substantially modified from 
its original form by deliberate human intervention. 

As used herein, a "recombinant expression cassette" is a nucleic acid construct, 
generated recombinantly or synthetically, with a series of specified nucleic acid elements 
which permit transcription of a particular nucleic acid in a target cell. The recombinant 
expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA, 
plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant expression cassette 
portion of an expression vector includes, among other sequences, a nucleic acid sequence to 
be transcribed and a promoter. 

It is contemplated that the gene sequences may be synthesized, either completely or in 
part, especially where it is desirable to provide plant-preferred sequences. Thus, all or a 
portion of the desired structural gene (that portion of the gene which encodes the 
acyltransferase protein) may be synthesized using codons preferred by a selected host. Host- 
preferred codons may be determined, for example, from the codons used most frequently in 
the proteins expressed in a desired host species. 
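
As a minimal sketch of how host-preferred codons might be tallied, the following counts codon usage per amino acid across a set of in-frame coding sequences and reports the most frequent codon for each. The function name, the input format, and the use of the standard genetic code are assumptions made for illustration and are not part of the described method.

```python
from collections import Counter, defaultdict

# Standard genetic code, built in TCAG order (assumed; organelle codes differ).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def preferred_codons(coding_sequences):
    """Return {amino acid: most frequently used codon} for in-frame DNA sequences."""
    counts = defaultdict(Counter)
    for seq in coding_sequences:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            aa = CODON_TABLE.get(codon)
            # Skip stop codons and codons containing ambiguous bases.
            if aa is not None and aa != "*":
                counts[aa][codon] += 1
    return {aa: usage.most_common(1)[0][0] for aa, usage in counts.items()}
```

In practice the input would be coding sequences of genes highly expressed in the selected host; the resulting table could then guide codon choice when synthesizing all or part of the structural gene.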

One skilled in the art will readily recognize that antibody preparations, nucleic acid 
probes (DNA and RNA) and the like may be prepared and used to screen and recover 
"homologous" or "related" acyltransferase from a variety of plant sources. Homologous 
sequences are found when there is an identity of sequence, which may be determined upon 
comparison of sequence information, nucleic acid or amino acid, or through hybridization 
reactions between a known acyltransferase and a candidate source. Conservative changes, 
such as Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys and Gln/Asn may also be considered in 
determining sequence homology. Amino acid sequences are considered homologous by as 
little as 25% sequence identity between the two complete mature proteins. (See generally, 
Doolittle, R.F., OF URFS and ORFS (University Science Books, CA, 1986).) 
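
The identity comparison described above can be sketched as follows for two amino acid sequences that have already been aligned. The function name, the gap character, and the choice to compute percentages over aligned (non-gap) columns are assumptions for illustration; the text does not specify how the denominator is chosen.

```python
# Conservative pairs named in the text: Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys, Gln/Asn.
CONSERVATIVE_PAIRS = {frozenset(p) for p in ("ED", "VI", "ST", "RK", "QN")}


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent identity plus conservative changes).

    Both inputs must come from the same pairwise alignment and therefore have
    equal length. Columns containing a gap in either sequence are ignored
    (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = conservative = compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
        elif frozenset((x, y)) in CONSERVATIVE_PAIRS:
            conservative += 1
    if compared == 0:
        return 0.0, 0.0
    return (100.0 * identical / compared,
            100.0 * (identical + conservative) / compared)
```

A pair of sequences scoring at or above 25% identity under such a measure would fall within the homology threshold stated above.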

Thus, other acyltransferase sequences can be obtained from the specific exemplified 
sequences provided herein. Furthermore, it will be apparent that one can obtain natural and 
synthetic sequences, including modified amino acid sequences and starting materials for 
synthetic-protein modeling from the exemplified sequences and from acyltransferases which 
are obtained through the use of such exemplified sequences. Modified amino acid sequences 
include sequences which have been mutated, truncated, increased and the like, whether such 
sequences were partially or wholly synthesized. Sequences which are actually purified from 
plant preparations or are identical or encode identical proteins thereto, regardless of the 
method used to obtain the protein or sequence, are equally considered naturally derived. 

For immunological screening, antibodies to the acyltransferase protein can be 
prepared by injecting rabbits or mice with the purified protein or portion thereof, such 
methods of preparing antibodies being well known to those in the art. Either monoclonal or 
polyclonal antibodies can be produced, although typically polyclonal antibodies are more 
useful for gene isolation. Western analysis may be conducted to determine that a related 
protein is present in a crude extract of the desired plant species, as determined by cross- 
reaction with the antibodies to the acyltransferase protein. When cross-reactivity is observed, 
genes encoding the related proteins are isolated by screening expression libraries representing 
the desired plant species. Expression libraries can be constructed in a variety of commercially 
available vectors, including lambda gt11, as described in Sambrook et al. (Molecular 
Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory, Cold 
Spring Harbor, New York). 

The nucleic acid sequences associated with acyltransferase proteins will find many 
uses. For example, recombinant constructs can be prepared which can be used as probes, or 
which will provide for expression of the acyltransferase protein in host cells to produce a 
ready source of the enzyme and/or to modify the composition of triglycerides found therein. 
Other useful applications may be found when the host cell is a plant host cell, either in vitro 
or in vivo. 

The modification of fatty acid compositions may also affect the fluidity of plant 
membranes. Different lipid concentrations have been observed in cold-hardened plants, for 
example. By this invention, one may be capable of introducing traits which will lend to chill 
tolerance. Constitutive or temperature inducible transcription initiation regulatory control 
regions may have special applications for such uses. 

As discussed above, nucleic acid sequence encoding an acyltransferase of this 
invention may include genomic, cDNA or mRNA sequence. By "encoding" is meant that the 
sequence corresponds to a particular amino acid sequence either in a sense or anti-sense 
orientation. By "extrachromosomal" is meant that the sequence is outside of the plant 
genome with which it is naturally associated. By "recombinant" is meant that the sequence 
contains a genetically engineered modification through manipulation via mutagenesis, 
restriction enzymes, and the like. 

Once the desired acyltransferase nucleic acid sequence is obtained, it may be 
manipulated in a variety of ways. Where the sequence involves non-coding flanking regions, 
the flanking regions may be subjected to resection, mutagenesis, etc. Thus, transitions, 
transversions, deletions, and insertions may be performed on the naturally occurring 
sequence. In addition, all or part of the sequence may be synthesized. In the structural gene, 
one or more codons may be modified to provide for a modified amino acid sequence, or one 
or more codon mutations may be introduced to provide for a convenient restriction site or 
other purpose involved with construction or expression. The structural gene may be further 
modified by employing synthetic adapters, linkers to introduce one or more convenient 
restriction sites, or the like. 

The nucleic acid or amino acid sequences encoding an acyltransferase of this 
invention may be combined with other non-native, or "heterologous", sequences in a variety 
of ways. By "heterologous" sequences is meant any sequence which is not naturally found 
joined to the acyltransferase, including, for example, combinations of nucleic acid sequences 
from the same plant which are not naturally found joined together. 

The DNA sequence encoding an acyltransferase of this invention may be employed in 
conjunction with all or part of the gene sequences normally associated with the 
acyltransferase. In its component parts, a DNA sequence encoding acyltransferase is 
combined in a DNA construct having, in the 5' to 3' direction of transcription, a transcription 
initiation control region capable of promoting transcription and translation in a host cell, the 
DNA sequence encoding plant acyltransferase and a transcription and translation termination 
region. 

Potential host cells include both prokaryotic cells, such as E. coli, and eukaryotic cells 
such as yeast, insect, amphibian, or mammalian cells. A host cell may be unicellular or found 
in a multicellular differentiated or undifferentiated organism depending upon the intended 
use. Preferably, host cells of the present invention include plant cells, both 
monocotyledonous and dicotyledonous. Cells of this invention may be distinguished by 
having a sequence foreign to the wild-type cell present therein, for example, by having a 
recombinant nucleic acid construct encoding an acyltransferase therein. 

The methods used for the transformation of the host plant cell are not critical to the 
present invention. The transformation of the plant is preferably permanent, i.e. by integration 
of the introduced expression constructs into the host plant genome, so that the introduced 
constructs are passed on to successive plant generations. The skilled artisan will recognize 
that a wide variety of transformation techniques exist in the art, and new techniques are 
continually becoming available. Any technique that is suitable for the target host plant can be 
employed within the scope of the present invention. For example, the constructs can be 
introduced in a variety of forms including, but not limited to, as a strand of DNA, in a 
plasmid, or in an artificial chromosome. The introduction of the constructs into the target 
plant cells can be accomplished by a variety of techniques, including, but not limited to, 
calcium-phosphate-DNA co-precipitation, electroporation, microinjection, Agrobacterium 
infection, liposomes or microprojectile transformation. The skilled artisan can refer to the 
literature for details and select suitable techniques for use in the methods of the present 
invention. 

Normally, included with the DNA construct will be a structural gene having the 
necessary regulatory regions for expression in a host and providing for selection of 
transformant cells. The gene may provide for resistance to a cytotoxic agent, e.g. antibiotic, 
heavy metal, toxin, etc., complementation providing prototrophy to an auxotrophic host, viral 
immunity or the like. Depending upon the number of different host species into which the expression 
construct or components thereof are introduced, one or more markers may be employed, 
where different conditions for selection are used for the different hosts. 

Where Agrobacterium is used for plant cell transformation, a vector may be used 
which may be introduced into the Agrobacterium host for homologous recombination with T- 
DNA or the Ti- or Ri-plasmid present in the Agrobacterium host. The Ti- or Ri-plasmid 
containing the T-DNA for recombination may be armed (capable of causing gall formation) 
or disarmed (incapable of causing gall formation), the latter being permissible, so long as the 
vir genes are present in the transformed Agrobacterium host. The armed plasmid can give a 
mixture of normal plant cells and gall. 

In some instances where Agrobacterium is used as the vehicle for transforming host 
plant cells, the expression or transcription construct bordered by the T-DNA border region(s) 
will be inserted into a broad host range vector capable of replication in E. coli and 
Agrobacterium, there being broad host range vectors described in the literature. Commonly 
used is pRK2 or derivatives thereof. See, for example, Ditta et al. (Proc. Nat. Acad. Sci., 
U.S.A. (1980) 77:7347-7351) and EPA 0 120 515, which are incorporated herein by reference. 
Alternatively, one may insert the sequences to be expressed in plant cells into a vector 
containing separate replication sequences, one of which stabilizes the vector in E. coli, and 
the other in Agrobacterium. See, for example, McBride and Summerfelt (Plant Mol. Biol. 
(1990) 14:269-276), wherein the pRiHRI (Jouanin et al., Mol. Gen. Genet. (1985) 201:370- 
374) origin of replication is utilized and provides for added stability of the plant expression 
vectors in host Agrobacterium cells. 

Included with the expression construct and the T-DNA will be one or more markers, 
which allow for selection of transformed Agrobacterium and transformed plant cells. A 
number of markers have been developed for use with plant cells, such as resistance to 
chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin, or the like. The 
particular marker employed is not essential to this invention, one or another marker being 
preferred depending on the particular host and the manner of construction. 

For transformation of plant cells using Agrobacterium, explants may be combined and 
incubated with the transformed Agrobacterium for sufficient time for transformation, the 
bacteria killed, and the plant cells cultured in an appropriate selective medium. Once callus 
forms, shoot formation can be encouraged by employing the appropriate plant hormones in 
accordance with known methods and the shoots transferred to rooting medium for 
regeneration of plants. The plants may then be grown to seed and the seed used to establish 
repetitive generations and for isolation of vegetable oils. 

There are several possible ways to obtain the plant cells of this invention which 
contain multiple expression constructs. Any means for producing a plant comprising a 
construct having a nucleic acid sequence of the present invention, and at least one other 
construct having another DNA sequence encoding an enzyme are encompassed by the present 
invention. For example, the expression construct can be used to transform a plant at the same 
time as the second construct either by inclusion of both expression constructs in a single 
transformation vector or by using separate vectors, each of which expresses desired genes. The 
second construct can be introduced into a plant which has already been transformed with the 
first expression construct, or alternatively, transformed plants, one having the first construct 
and one having the second construct, can be crossed to bring the constructs together in the 
same plant. 

In general, acyltransferase proteins are active in the transfer of acyl groups from a 
donor to a variety of different substrates. For example, diacylglycerol acyltransferases add 
acyl groups to diacylglycerol to form triacylglycerol (TAG), or acyl-CoA:cholesterol 
acyltransferase uses an acyl-CoA as a donor to transfer an acyl group to a sterol to form a 
sterol ester. Typically, the substrates include, but are not limited to, glycerides, including 
mono- and diglycerides, sterols, stanols, phosphatides, and the like. Donors include, but are 
not limited to, acyl-CoA and acyl-ACP molecules. 

The invention now being generally described, it will be more readily understood by 
reference to the following examples which are included for purposes of illustration only and 
are not intended to limit the present invention. 

EXAMPLES 

Example 1: RNA Isolations 

Total RNA from the inflorescence and developing seeds of Arabidopsis thaliana is 
isolated for use in construction of complementary (cDNA) libraries. The procedure is an 
adaptation of the DNA isolation protocol of Webb and Knapp (D.M. Webb and S.J. Knapp, 
(1990) Plant Molec. Reporter, 8, 180-185). The following description assumes the use of 1 g 
fresh weight of tissue. Frozen seed tissue is powdered by grinding under liquid nitrogen. The 
powder is added to 10 ml REC buffer (50 mM Tris-HCl, pH 9, 0.8 M NaCl, 10 mM EDTA, 
0.5% w/v CTAB (cetyltrimethyl-ammonium bromide)) along with 0.2 g insoluble 
polyvinylpolypyrrolidone, and ground at room temperature. The homogenate is centrifuged 
for 5 minutes at 12,000 x g to pellet insoluble material. The resulting supernatant fraction is 
extracted with chloroform, and the top phase is recovered. 

The RNA is then precipitated by addition of 1 volume RecP (50 mM Tris-HCl pH 9, 
10 mM EDTA and 0.5% (w/v) CTAB) and collected by brief centrifugation as before. The 
RNA pellet is redissolved in 0.4 ml of 1 M NaCl. The RNA pellet is redissolved in water and 
extracted with phenol/chloroform. Sufficient 3 M potassium acetate (pH 5) is added to make 
the mixture 0.3 M in acetate, followed by addition of two volumes of ethanol to precipitate the 
RNA. After washing with ethanol, this final RNA precipitate is dissolved in water and stored 
frozen. 
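
The volume of 3 M potassium acetate needed to reach 0.3 M can be worked out from a simple dilution balance. The sketch below assumes that the sample initially contains no acetate and that volumes are additive; the function names are illustrative only.

```python
def stock_volume_to_add(sample_volume, stock_molar=3.0, final_molar=0.3):
    """Volume of stock to add so the mixture reaches final_molar.

    Solves stock_molar * v / (sample_volume + v) = final_molar for v,
    assuming the sample contains none of the solute and volumes are additive.
    """
    if not 0 < final_molar < stock_molar:
        raise ValueError("final concentration must be between 0 and the stock")
    return sample_volume * final_molar / (stock_molar - final_molar)


def ethanol_volume(sample_volume, acetate_volume, volumes=2.0):
    """Ethanol to add: a multiple (two volumes in the text) of the combined mixture."""
    return volumes * (sample_volume + acetate_volume)
```

With the stated concentrations this reduces to adding one ninth of the sample volume, for example 0.1 ml of 3 M potassium acetate to 0.9 ml of RNA solution, followed by 2.0 ml of ethanol.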

Alternatively, total RNA may be obtained using TRIzol reagent (BRL- 
Life Technologies, Gaithersburg, MD) following the manufacturer's protocol. The RNA 
precipitate is dissolved in water and stored frozen. 

Example 2: Identification of Acyltransferase Homology Sequences 

Searches are performed on a Silicon Graphics Unix computer using additional 
Bioaccellerator hardware and GenWeb software supplied by Compugen Ltd. This software 
and hardware enables the use of the Smith-Waterman algorithm in searching DNA and 
protein databases using profiles as queries. The program used to query protein databases is 
profilesearch. This is a search where the query is not a single sequence but a profile based on 
a multiple alignment of amino acid or nucleic acid sequences. The profile is used to query a 
sequence data set, i.e., a sequence database. The profile contains all the pertinent information 
for scoring each position in a sequence, in effect replacing the "scoring matrix" used for the 
standard query searches. The program used to query nucleotide databases with a protein 
profile is tprofilesearch. Tprofilesearch searches nucleic acid databases using an amino acid 
profile query. As the search is running, sequences in the database are translated to amino acid 
sequences in six reading frames. The output file for tprofilesearch is identical to the output 
file for profilesearch except for an additional column that indicates the frame in which the 
best alignment occurred. 
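
The six-frame translation step mentioned above can be illustrated with a short sketch. This is not the tprofilesearch implementation; the function names, the frame numbering (1 to 3 for the forward strand, -1 to -3 for the reverse complement), and the use of the standard genetic code are assumptions.

```python
# Standard genetic code in TCAG order (assumed for illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return {frame: protein} for frames 1, 2, 3 and -1, -2, -3."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames
```

Each translated frame could then be scored against the amino acid profile, and the frame giving the best alignment recorded, which corresponds to the extra output column described above.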

The Smith-Waterman algorithm, (Smith and Waterman (1981) supra), is used to 
search for similarities between one sequence from the query and a group of sequences 
contained in the database. E score values as well as other sequence information, such as 
conserved peptide sequences of HXXXXD and PEG are used to identify related sequences. 

By using the conserved peptide sequence information, E score values of greater than E-12 and 
E-8 are considered. For example, the EST sequence originally used to identify ATAT2 had 
an E score of 0.0094, while the EST sequence originally used to identify ATLPAAT1 had an 
E score of 0.0868. 
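
To make the two kinds of evidence above concrete, the following sketch computes a Smith-Waterman local alignment score between two sequences and scans a protein for the conserved HXXXXD and PEG peptides. It is a simplified illustration: it uses a fixed match/mismatch score and a linear gap penalty rather than the profile scoring used in the searches, it does not compute E scores, and all names and parameter values are assumptions.

```python
import re


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty, score only)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


# X in HXXXXD stands for any amino acid.
MOTIFS = {"HXXXXD": re.compile(r"H.{4}D"), "PEG": re.compile(r"PEG")}


def find_motifs(protein):
    """Return {motif name: [0-based start positions]} for non-overlapping matches."""
    protein = protein.upper()
    return {name: [m.start() for m in pattern.finditer(protein)]
            for name, pattern in MOTIFS.items()}
```

A candidate with a weak alignment score might still be retained for follow-up when both motifs are present, which mirrors the way the conserved peptides are used alongside E scores above.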

A protein sequence of glycerol-3-phosphate acyltransferase (G3PAAT) from E. coli (Swiss Prot Accession 
P00482) is used to search the NCBI non-redundant protein database using BLAST. In the 
first round of searches, other membrane forms of G3PAAT are identified. In subsequent PSI- 
BLAST searches (Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402), LPAATs and 
other acyltransferases are identified. Using sequence alignment software programs, G3PAAT 
and different LPAAT amino acid sequences are aligned, and a profile is generated using a 
homologous sequence region, between amino acids 256 and 459 of the E. coli sequence. 

The identified 204 amino acid region is used to query the protein database using PSI-BLAST. 
After 5 iterations of PSI-BLAST, the profile generated from this new query (Figure 1) 
identified soluble forms of G3PAAT. Prior to this identification, no sequence homology had 
been identified between the membrane and soluble forms of G3PAAT. 

Example 3: Excision of PSI-BLAST Profile 

The profile generated from the queries using PSI-BLAST is excised from the hypertext 
markup language (html) file. The worldwide web (www)/html interface to psiblast at 
ncbi stores the current generated profile matrix in a hidden field in the html file that is 
returned after each iteration of psiblast. However, this matrix has been encoded into string62 
(s62) format for ease of transport through html. String62 format is a simple conversion of the 
values of the matrix into html legal ascii characters. 

The encoded matrix width (x axis) is 26 characters, and comprises the consensus 
character, the probabilities of each amino acid in the order A,B,C,D,E,F,G,H,I,K,L,M,N, 
P,Q,R,S,T,V,W,X,Y,Z (where B represents D and N, and Z represents Q and E, and X 
represents any amino acid), gap creation value, and gap extension value. 

The length (y axis) of the matrix corresponds to the length of the sequences identified 
by PSI-BLAST. The order of the amino acids corresponds to the conserved amino acid 
sequence of the sequences identified using PSI-BLAST, with the N-terminal end at the top of 
the matrix. The probabilities of other amino acids at that position are represented for each 
amino acid along the x axis, below the respective single letter amino acid abbreviation. 

Thus, each row of the profile consists of the highest scoring (consensus) amino acid, 
followed by the scores for each possible amino acid at that position in sequence matrix, the 
score for opening a gap at that position, and the score for continuing a gap at that position. 

The string62 file is converted back into a profile for use in subsequent searches. The 
gap open field is set to 11 and the gap extension field is set to 1 along the x axis. The gap 
creation and gap extension values are known, based on the settings given to the PSI-BLAST 
algorithm. The matrix is exported to the standard GCG profile form. This format can be read 
by GenWeb. 

The algorithm used to convert the string62 formatted file to the matrix is outlined in 
Table 1. 

Table 1 

1. if the encoded character is z then the value is blast score min 
2. if the encoded character is Z then the value is blast score max 
3. else if the encoded character is uppercase then its value is (64 - (ascii # of char)) 
4. else if the encoded character is a digit the value is ((ascii # of char) - 48) 
5. else if the encoded character is not uppercase then the value is ((ascii # of char) - 87) 
6. ALL B positions are set to min of D and N amino acids at that row in sequence matrix 
7. ALL Z positions are set to min of Q and E amino acids at that row in sequence matrix 
8. ALL X positions are set to min of all amino acids at that row in sequence matrix 
9. kBLAST_SCORE_MAX=999; 
10. kBLAST_SCORE_MIN=-999; 
11. all gap opens are set to 11 
12. all gap lens are set to 1 
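
The rules of Table 1 can be sketched in code as follows. The character-to-score mapping, the B/Z/X rules, and the constants follow the table; the function names and the representation of a row as a dictionary keyed by single-letter amino acid code are assumptions, since the layout of the encoded string within the html field is not described here.

```python
BLAST_SCORE_MAX = 999   # rule 9
BLAST_SCORE_MIN = -999  # rule 10
GAP_OPEN = 11           # rule 11
GAP_LEN = 1             # rule 12


def decode_s62_char(ch):
    """Decode one string62 character into an integer score (rules 1 to 5)."""
    if ch == "z":
        return BLAST_SCORE_MIN
    if ch == "Z":
        return BLAST_SCORE_MAX
    if "A" <= ch <= "Y":
        return 64 - ord(ch)   # uppercase: A -> -1, B -> -2, ...
    if "0" <= ch <= "9":
        return ord(ch) - 48   # digits: 0 .. 9
    if "a" <= ch <= "y":
        return ord(ch) - 87   # lowercase: a -> 10, b -> 11, ...
    raise ValueError(f"not a string62 character: {ch!r}")


def fill_ambiguity_codes(row):
    """Apply rules 6 to 8 to a {amino acid letter: score} row; returns a new dict."""
    row = dict(row)
    row["B"] = min(row["D"], row["N"])
    row["Z"] = min(row["Q"], row["E"])
    row["X"] = min(v for k, v in row.items() if k not in "BZX")
    return row
```

For example, the characters "0", "a" and "A" decode to 0, 10 and -1 respectively, so the encoding covers small positive and negative scores with single html-safe characters, while "z" and "Z" are reserved for the minimum and maximum scores.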






Example 4: Identification of Novel Acyltransferase Related Amino Acid Sequences 

The profile (Figure 1) is used in further queries to identify a number of previously 
unidentified proteins from yeast as novel acyltransferases. A protein is identified from an 
Arabidopsis protein sequence database (ATAT1) (SEQ ID NO:2). Sequences are also 
identified from nucleic acid databases (Table 2). 

Table 2 

Database ID Number    BLAST Search Hits                             Log probability 
Saccharomyces cerevisiae 
gi 1078509            Limnanthes putative LPAAT                     e-10 (SEQ ID NO:217) 
gi 586485             Limnanthes putative LPAAT                     e-13 (SEQ ID NO:218) 
gi 320748             Limnanthes putative LPAAT                     e-19 (SEQ ID NO:219) 
gi 2506920            SUPPRESSES CTR1 (choline transport mutant)    (SEQ ID NO:220) 
gi [illegible]        similar to CTR1                               e-118 (SEQ ID NO: [illegible]) 
gi [illegible]        unidentified                                  (SEQ ID NO: [illegible]) 
gi [illegible]        unidentified                                  (SEQ ID NO:223) 
gi 2132299            TAFAZZIN                                      e-14 (SEQ ID NO:224) 

PCT/US99/22231 


In Table 2, the gi number is the database identifier, the middle column shows the 
results of BLAST searches against the NCBI NR protein database, and the log probability 
number represents the log of the probability of such a match occurring by random 
chance. These proteins, including the ATAT1 protein sequence, are identified using the 
original PSI-BLAST search of the NCBI NR protein database. Thus, these proteins are novel 
acyltransferase related proteins with unidentified activities. 

The Arabidopsis acyltransferase sequence, herein referred to as ATAT1, is also 
identified using the original PSI-BLAST search of the NCBI NR protein database, and did not 
have an annotated function. 

Additional Arabidopsis amino acid sequences related to acyltransferases are identified 
from the databases, referred to as ATAT2est, ATAT3est, ATAT4est, ATAT5est, ATAT6est, 
ATAT7est, ATAT8est, ATAT9, ATAT10, and ATAT11est. Furthermore, Arabidopsis 
amino acid sequences are identified which demonstrate sequence similarity to known 
lysophosphatidic acid acyltransferases, referred to as ATLPAAT1. The sequences of ATAT9 and ATAT10 
are identified from the database as genomic sequences; all other Arabidopsis sequences are 
identified as ESTs. 
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To obtain the entire coding region corresponding to the A rabidopsis acyltransferase 
sequences, synthetic oligo-nucleotide primers are designed to amplify the 5' and 3' ends of 
partial cDNA clones containing acyltransferase related sequences. Primers are designed 
according to the respective ,4 rabidopsis acyltransferase related sequences (Table 3) and used 
5 in Rapid Amplification of cDNA Ends (RACE) reactions (Frohman et al. (1988) Proc. Natl. 
Acad. Sci. USA 85:8998-9002) using the Marathon cDNA amplification kit (Clontech 
Laboratories Inc, Palo Alto, CA). Primers with an R designation are used for 5' RACE 
reactions, and primers with an F designation are used for 3' RACE reactions. 
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Table 3 



ATAT2 



ATAT2R1 CCATCCGCTTCAAGGGAACGACACCCATCA (SEQ ID NO:135) 

ATAT2R2 TCCCTGTCTTGCTTGATGAACTTAAAGCTTG (SEQ ID NO: 136) 

ATAT2R3 ACAGC AGG AGTGTCTGATG ATGGCAGATTC (SEQ ID NO: 1 37 ) 

ATAT3 



ATAT3R 1 ACTGGAGTTCCAGCCAAAAATGCACCTGTC < SEQ ID NO: 138) 
ATAT3R2 G ATAC ACCCTTG A A ATCAGGCG ATTTTGCT ( SEQ ID NO: 1 39) 



ATAT4 



ATAT4R] TTGCAAATTCAATTCCTGTTTCACCGGGCC (SEQ ID NO: 140) 
ATAT4R2 GTTTTCTGCTATTCCAGAAGGCGTCAACAA (SEQ ID NO: 141 ) 

AT ATS 



ATAT5R1 CATTGAAGATCCGTCCGTGAAGTTNCCTTACC (SEQ ID NO: 142) 

ATAT5R2 TCGAGCTGTGATCGATGATTGGCTGTGAAG (SEQ ID NO: 143 ) 

ATAT5F1 GTCTCTTCAAAAACACACACACACGTCTCT (SEQ ID NO: 144) 

ATAT5F2 GTCTCTTCAAAAACACACACACACGTCTCT (SEQ ID NO: 145 ) 

ATAT6 



H76348-F1 GTAGAGAGCCTTACTTGCTTCGGTTTAGTC (SEQ ID NO: 146) 

H76348-F2 ACGTCATCGTACCTGTTGCTATTGACTCAC (SEQ ID NO: 147) 

H76348-R1 ACTTTTCCATTGTCAGGGACTCCTCGACAC (SEQ ID NO: 148) 

H76348-R2 ACGGTGTAGGAAGGGAAAGGATTCAAAAGG (SEQ ID NO: 149) 

ATAT7 



ATTS0193-F1 GCGATGAACTACAGAGTCGGATTCTTCCTC (SEQ IDNO:150) 
ATTS0193-F2 CCGGTTTACGAGATTACGTTCTTGAACCAG (SEQ ID NO: 151 ) 
ATTSO 1 93-R 1 C A ATGGAG ACA AGGCTCG A A AGTGCTAACC (SEQ ID NO: 1 52) 
ATTS0193-R2 ATTCTCTGAACATAGTTCGCCACGGTCATG (SEQ ID NO: 153) 
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ATAT8 






AA042618- 


Fl GAAATCCAACGCCTTCCCAATATCACTCTG (SEQ ID NO: 154) 


AA042618- 


F^> fTTfA A PTT I PP A Tf A PiPt ATPTTPtPtP A rGT 


( SFD TD NO- 1 SS) 
\ _j i^y^s llj i n v_/. i _> _j y 


AA042618- 


R 1 APPAPTTPTT APAP APPTTAPPTOPTT AP1G 


f <\FO TD NO- 1 56"i 


AA042618- 


do TPPT APPT AP APP ATPP A ATTTPTPO APPP 


t SFO TD NO- 1 57"i 


ATATl l 






AT ATI l Rl 


CTGCGTCAAGTGAGCAACTCAGTTCTTGCA 


(SEQ ID NO: 158) 


ATATl 1R2 


TGGGAAGCAGCACGTTGTTCAGTATCGGAA 


(SEQ ID NO: 159) 


ATATl 1R3 


TAGCCTCTGTGTAATCTGTGCCCTCGGGGA 


(SEQ ID NO: 160) 



From the nucleic acid sequences obtained from the RACE reactions, protein sequence
is predicted for each nucleic acid sequence using MacVector software. Nucleic acid sequences
are provided for ATAT1 (SEQ ID NO:1), ATAT2 (SEQ ID NO:3), ATAT3 (SEQ ID NO:5),
ATAT4 (SEQ ID NO:7), ATAT5 (SEQ ID NO:9), ATAT6 (SEQ ID NO:10), ATAT7 (SEQ
ID NO:12), ATAT8 (SEQ ID NO:14), ATAT9 (SEQ ID NO:16), ATAT10 (SEQ ID NO:18),
ATAT11 (SEQ ID NO:20) and ATLPAAT1 (SEQ ID NO:22), respectively.

The protein sequence (SEQ ID NO:2) derived from the ATAT1 nucleic acid sequence
from Arabidopsis has a predicted molecular mass of 32.5 kDa and a pI of 9.74. Alignment
of the Arabidopsis acyltransferase with several LPAAT and G3PAAT sequences shows that
some of the domains that are conserved between LPAAT and G3PAAT are conserved in the new
acyltransferase protein.

The ATAT2 nucleic acid sequence is predicted to encode a 312 amino acid protein
(SEQ ID NO:4), with a molecular weight of 34.6 kD and a pI of 9.99. The ATAT2 protein
may also contain 2 to 3 transmembrane domains. However, the protein encoded by the
ATAT2 nucleic acid sequence may be longer than predicted because of the absence of an
in-frame stop codon upstream of the ATG start codon used.
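
The check for an in-frame stop codon upstream of a chosen ATG can be sketched as follows. This is a minimal illustration, not the procedure used to analyze ATAT2; the function name and the two short test sequences are hypothetical and are not taken from the sequence listing.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_inframe_stop(cdna: str, atg_index: int):
    """Return the index of the nearest in-frame stop codon 5' of the ATG, or None."""
    seq = cdna.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    # Step backwards one codon at a time, staying in the reading frame of the ATG.
    for i in range(atg_index - 3, -1, -3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Hypothetical sequences for illustration only.
closed_upstream = "GGTAAGCCATGGCT"  # TAA at index 2 is in frame with the ATG at index 8
open_upstream = "GGCAAGCCATGGCT"    # no in-frame stop codon upstream of the ATG

print(upstream_inframe_stop(closed_upstream, 8))
print(upstream_inframe_stop(open_upstream, 8))
```

When no upstream in-frame stop codon is found, as in the second sequence, the open reading frame may extend further 5' than the clone shows, which is the situation described for ATAT2.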

The ATAT3 nucleic acid sequence is predicted to encode a 398 amino acid protein
(SEQ ID NO:6), with a molecular weight of 44.7 kD and a pI of 5.62. The ATAT3 protein
may contain 1 to 4 transmembrane domains. The ATAT4 nucleic acid sequence is predicted
to encode a 317 amino acid protein (SEQ ID NO:8), with a molecular weight of 36.5 kD and
a pI of 9.67. The ATAT4 protein is predicted to have 2 to 5 transmembrane domains.

The ATLPAAT1 nucleic acid sequence is predicted to encode a 389 amino acid
protein (SEQ ID NO:23), with a molecular weight of 43.7 kD and a pI of 9.52. The
ATLPAAT1 protein is predicted to have up to 3 transmembrane domains. The protein
predicted from the ATLPAAT1 nucleic acid sequence is similar to LPAATs reported for
Brassica, maize, and meadowfoam (described in PCT Publication WO 94/13814). The
ATAT11 nucleic acid sequence is predicted to encode a 375 amino acid protein (SEQ ID
NO:21), with a molecular weight of 43.5 kD and a pI of 9.45. The deduced amino acid
sequences of ATAT6 (SEQ ID NO:11), ATAT7 (SEQ ID NO:13), ATAT8 (SEQ ID NO:15),
ATAT9 (SEQ ID NO:17), and ATAT10 (SEQ ID NO:19) are also provided.
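
A rough sense of how a molecular weight follows from a predicted amino acid sequence can be had from the sketch below. It is not the MacVector calculation; it assumes standard average residue masses, adds one water per chain, and also shows the common rule of thumb of about 110 daltons per residue. The function names are illustrative.

```python
# Average residue masses in daltons (standard values, one-letter amino acid codes).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once per chain for the free N- and C-termini


def molecular_weight_kd(protein: str) -> float:
    """Sum the residue masses of a one-letter protein sequence and return kD."""
    return (sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER) / 1000.0


def rule_of_thumb_kd(n_residues: int, mean_residue_mass: float = 110.0) -> float:
    """Estimate kD from length alone, assuming about 110 daltons per residue."""
    return n_residues * mean_residue_mass / 1000.0


# Lengths reported above for ATAT2, ATAT3, ATAT4, ATLPAAT1 and ATAT11.
for name, length in [("ATAT2", 312), ("ATAT3", 398), ("ATAT4", 317),
                     ("ATLPAAT1", 389), ("ATAT11", 375)]:
    print(name, length, round(rule_of_thumb_kd(length), 1))
```

The length-based estimates fall within a few kD of the molecular weights reported above; an exact figure requires the full sequence and `molecular_weight_kd`.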

A sequence region from approximately 30 amino acids upstream through approximately
100 amino acids downstream of the conserved amino acid sequences HXXXXD (Heath and
Rock (1998) J. Bacteriol. 180(6):1425-1430) and PEG (Neuwald (1997) Curr. Biol. 7:R465-
R466) of the predicted amino acid sequences derived from the nucleic acid sequences of
ATAT1, ATAT2, ATAT3, ATAT4, ATAT6, ATAT7, ATAT8, ATAT9, ATAT10,
ATLPAAT1, and ATAT11 is compared to the amino acid sequences of lysophosphatidic
acid acyltransferases (Jojoba AT (SEQ ID NO:162; the nucleic acid sequence is provided in
SEQ ID NO:161), maize AT (PCT Publication WO 94/13814), PLSC coco (GenBank
accession 1098605), PLSC Lim (GenBank accession 1209507), PLSC Ecoli (GenBank
accession 1209507), and PLSC Yeast (GenBank accession 464422)) and glycerol-3-phosphate
acyltransferases (PLSB Ecoli (GenBank accession 130326) and PLSB Mouse (GenBank
accession 2498786)) (Figure 2), and similarities are identified (Figure 2 and Figure 3).

Sequence comparisons reveal that several classes of acyltransferases exist, based on
conserved amino acid sequences identified in the comparisons in Figure 2. For example,
ATAT1, ATAT6, ATAT7, ATAT8, and ATAT9 contain the conserved amino acid
sequences VTYSXS (SEQ ID NO:128), VXLTRXR (SEQ ID NO:129), and LXXGDLV (SEQ
ID NO:132) between the HXXXXD and PEG sequences. In addition, ATAT1, ATAT6,
ATAT7, ATAT8, and ATAT9 also contain the conserved sequence CPEGT (SEQ ID NO:130),
which comprises the PEG sequence, as well as IVPVA (SEQ ID NO:131) and
VANXXQ (SEQ ID NO:134) (Figure 2) downstream of the PEG sequence. The sequences
corresponding to ATAT1, ATAT7, and ATAT9 are the most closely related in this class, with
similarities between ATAT1 and ATAT9 of 67.0%, between ATAT1 and ATAT7 of 58.2%,
and between ATAT9 and ATAT7 of 63.9% (Figure 3B).

Sequence comparisons also demonstrate that the sequence of ATLPAAT1 is most
closely related to the jojoba LPAAT (82.3% similar) and the maize LPAAT (78.0% similar).

Furthermore, sequence analysis demonstrates that ATAT4 is the most divergent
sequence, with its highest similarity to ATAT10 (18.5%). Its highest similarity (15.3%) to a
known sequence is with a meadowfoam (Limnanthes douglasii) LPAAT. However, the
sequences of ATAT4 and ATAT10 share several conserved peptide sequences with the amino
acid sequences of ATAT2 and ATAT3 (Figure 2): VXNHXS (SEQ ID NO:127), where the H
comprises the conserved H of the HXXXXD sequence, and FXXGAF (SEQ ID NO:133)
downstream of the PEG sequence.
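
The conserved peptide sequences described above can be located in a protein sequence with a simple pattern search. The following Python sketch is illustrative only: it treats X as any amino acid and searches residues 97 to 192 of the ATAT1 protein, transcribed in one-letter code from the SEQ ID NO:2 listing as printed (transcription errors are possible). The function and variable names are assumptions, not part of the original analysis.

```python
import re

# Residues 97-192 of SEQ ID NO:2 (ATAT1), one string per 16-residue listing line.
ATAT1_FRAGMENT = (
    "LNHRTALDPIIVAIAL"
    "GRKICCVTYSVSRLSL"
    "MLSPIPAVALTRDRAT"
    "DAANMRKLLEKGDLVI"
    "CPEGTTCREEYLLRFS"
    "ALFAELSDRIVPVAMN"
)

MOTIFS = ["HXXXXD", "VTYSXS", "VXLTRXR", "LXXGDLV", "CPEGT", "PEG", "IVPVA"]


def find_motif(protein: str, motif: str) -> list:
    """Return 0-based start positions of a motif, with X matching any residue."""
    pattern = motif.replace("X", ".")
    return [m.start() for m in re.finditer(pattern, protein)]


positions = {motif: find_motif(ATAT1_FRAGMENT, motif) for motif in MOTIFS}
for motif, starts in positions.items():
    print(motif, starts)

# Spacing of each motif relative to the HXXXXD and PEG landmarks.
h_start = positions["HXXXXD"][0]
peg_start = positions["PEG"][0]
print("VTYSXS downstream of HXXXXD by", positions["VTYSXS"][0] - h_start)
print("IVPVA downstream of PEG by", positions["IVPVA"][0] - peg_start)
```

In this fragment the motifs occur in the order HXXXXD, VTYSXS, VXLTRXR, LXXGDLV, CPEGT, IVPVA, consistent with the arrangement described for the class containing ATAT1.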

Example 6: Identification of Additional Acyltransferase Sequences 

The novel Arabidopsis sequences identified above are used to search proprietary
databases containing soybean and corn EST sequences. This search identifies
EST sequences from soybean (SEQ ID NO:24 through SEQ ID NO:85) as well as from corn
(SEQ ID NO:86 through SEQ ID NO:126) as encoding acyltransferase related proteins.

Sequence comparisons between the various EST sequences and the complete
Arabidopsis sequences reveal that the identified EST sequences demonstrate higher
similarity to particular Arabidopsis sequences, as determined by BLAST scores.
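
Grouping ESTs by their most similar Arabidopsis sequence can be sketched as below. The EST names and scores are hypothetical placeholders chosen for illustration; they are not results from the proprietary databases, and the function name is an assumption.

```python
from collections import defaultdict


def assign_by_best_score(hits):
    """Group ESTs under the query with the highest BLAST score.

    hits: iterable of (est_id, arabidopsis_name, blast_score) tuples.
    """
    best = {}
    for est, query, score in hits:
        if est not in best or score > best[est][1]:
            best[est] = (query, score)
    groups = defaultdict(list)
    for est, (query, _score) in best.items():
        groups[query].append(est)
    return {query: sorted(ests) for query, ests in groups.items()}


# Hypothetical hits: (EST, Arabidopsis query, BLAST score).
hits = [
    ("EST_A", "ATAT1", 210.0),
    ("EST_A", "ATAT9", 180.0),
    ("EST_B", "ATAT9", 95.0),
    ("EST_B", "ATAT7", 60.0),
    ("EST_C", "ATAT1", 75.0),
]
groups = assign_by_best_score(hits)
print(groups)
```

Each EST is listed once, under the Arabidopsis sequence giving its highest score.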

Expressed Sequence Tag (EST) sequences from soybean and corn databases are
identified which are most closely related by BLAST score to ATAT1 (SEQ ID NOS:24-29
and SEQ ID NOS:86-88, respectively), ATAT2 (SEQ ID NO:30 and SEQ ID NO:89,
respectively), ATAT3 (SEQ ID NOS:31-35 and SEQ ID NOS:90-94, respectively), ATAT4
(SEQ ID NOS:36-44 and SEQ ID NOS:95-100, respectively), ATAT6 (SEQ ID NOS:45-49
and SEQ ID NO:101, respectively), ATAT7 (SEQ ID NOS:50-54 and SEQ ID NOS:102-103,
respectively), ATAT8 (SEQ ID NOS:55-56 and SEQ ID NO:104, respectively), ATAT9
(SEQ ID NOS:57-79 and SEQ ID NOS:105-111, respectively), ATAT10 (SEQ ID NOS:80-
81 and SEQ ID NO:112, respectively), ATAT11 (SEQ ID NOS:82-85 and SEQ ID
NOS:123-126, respectively), and ATLPAAT1 (corn SEQ ID NOS:113-122).

Example 7: Expression Construct Preparation

A series of synthetic oligonucleotide primers were prepared for use in Polymerase
Chain Reactions (PCR) to amplify the entire DNA sequences encoding the various
acyltransferase sequences identified above. The sequences are listed in Table 3.



Table 3

Primer        Sequence (listed 5'-3')                                        SEQ ID NO:

ATAT1F        AAGCTTGCATGCGTCGACACAATGGTTCATGCGACCAAGTCAG                    163
ATAT1R        GGTACCGTCGACTCACTTCTTGGTGTTGTTGATAG                            164
ATAT2F        GGATCCGCGGCCGCACAATGACGAGCTTTACTACTTCCCTTCAT                   165
ATAT2R        GGATCCCCTGCAGGTTAGAGATCCATTGATTCTGCAAT                         166
ATAT3F        GGATCCGCGGCCGCATAATGGAATCAGAGCTCAAAGAT                         167
ATAT3R        GGATCCCCTGCAGGTCATTCTTCTTTCTGATGGAAATC                         168
ATAT4F        GGATCCGCGGCCGCACAATGACTCGTTCACAAGATGTTTCA                      169
ATAT4R        GGATCCCCTGCAGGTCACTTCTCTTCCAATCTAGCCAG                         170
ATAT6F        GGATCCGCGGCCGCACAATGTCCGGTAATAAGATCTCGACTCTTCA                 171
ATAT6R        GGATCCCCTGCAGGTTATTTTTTCTTGACAACTCCGTTATTACCGG                 172
ATAT7F        ATATCCGCGGCCGCACAATGGTTATGGAGCAAGCTGGAA                        173
ATAT7R        GGATCCCCTGCAGGTCAATGGAGACAAGGCTCGAAAGT                         174
ATAT8F        GGATCCGCGGCCGCACAATGTCCGCCAAGATTTCAATATTCC                     175
ATAT8R        GGATCCCCTGCAGGTTAATTTTTCTTAACTACTCCATT                         176
ATAT9F        GGATCCGCGGCCGCACAATGGGAGCTCAGGAGAAACGGCGCC                     177
ATAT9R        GGATCCCCTGCAGGTCACGTCTTCTCCTTCTTCACCGG                         178
ATAT10F       GGATCCGCGGCCGCACAATGGCGGATCCTGATCTGTCTTCTCCT                   179
ATAT10R       GGATCCCCTGCAGGTTATGTTGGGGCCAAGTCAGGTGCAAAGAT                   180
ATAT11F       GGATCCGCGGCCGCAAAATGGAAAAAAAGAGTGTACCAAATTCT                   181
ATAT11R       GGATCCCCTGCAGGTTATTTGTTTACTAATTTGAGGGAATTTTTTG                 182
ATLPAAT1F     TCGACCTGCAGGAAGCTTAAGGATGGTGATTGCTGC                           183
ATLPAAT1R     GGATCCGCGGCCGCTTACTTCTCCTTCTCCG                                184
YSCAT1F       GGATCCGCGGCCGCACAATGTCTTTTAGGGATGTCCTAG                        185
YSCAT1R       GGATCCCCTGCAGGTCAATCATCCTTACCCTTTGGTTTACC                      186
YSCAT1 KO F   ATGTCTTTTAGGGATGTCCTAGAAAGAGGAGATGAATTTTCTGTGCGGTATTTCACACCG   187
YSCAT1 KO R   TCAATCATCCTTACCCTTTGGTTTACCCTCTGGAGGCAGAAGATTGTACTGAGAGTGCAC   188
YSCAT2F       GGATCCGCGGCCGCACAATGAAGCATTCCCAAAAATACCGTAGG                   189
YSCAT2R       GGATCCCCTGCAGGTCAATGATTTTTTTTCATCACAAATAC                      190
YSCAT2 KO F   ATGAAGCATTCCCAAAAATACCGTAGGTATGGAATTTATGCTGTGCGGTATTTCACACCG   191
YSCAT2 KO R   TCAATGATTTTTTTTCATCACAAATACAAGAATAAGAAAAAGATTGTACTGAGAGTGCAC   192
YSCAT3F       GGATCCGCGGCCGCACAATGGGTTTTGTTGATTTCTTCGAAAC                    193
YSCAT3R       GGATCCCCTGCAGGTTATTTGGTCTCAATTTTAATATTTTTTTGC                  194
YSCAT3 KO F   ATGGGTTTTGTTGATTTCTTCGAAACATATATGGTCGGTTCTGTGCGGTATTTCACACCG   195
YSCAT3 KO R   TTATTTGGTCTCAATTTTAATATTTTTTTGCAAGGACTCGAGATTGTACTGAGAGTGCAC   196
YSCAT4F       GGATCCGCGGCCGCACAATGGAAAAGTACACCAATTGGAGAGAC                   197
YSCAT4R       GGATCCCCTGCAGGCTACTTCCTCTTTTTACGTTGATCGCTG                     198
YSCAT4 KO F   ATGGAAAAGTACACCAATTGGAGAGACAATGGTACGGGAACTGTGCGGTATTTCACACCG   199
YSCAT4 KO R   CTACTTCCTCTTTTTACGTTGATCGCTGATATATTCCTTCAGATTGTACTGAGAGTGCAC   200
YSCAT5F       GGATCCGCGGCCGCACAATGCCTGCACCAAAACTCACGGAG                      201
YSCAT5R       GGATCCCCTGCAGGCTACGCATCTCCTTCTTTCCCTTC                         202
YSCAT5 KO F   ATGCCTGCACCAAAACTCACGGAGAAATCTGCCTCTTCCACTGTGCGGTATTTCACACCG   203
YSCAT5 KO R   CTACGCATCTCCTTCTTTCCCTTCTTCTTCTTCTTCCTCTAGATTGTACTGAGAGTGCAC   204
YSCAT6F       GGATCCGCGGCCGCACAATGTCTGCTCCCGCTGCCGATCATAACGC                 205
YSCAT6R       GGATCCCCTGCAGGTCATTCTTTCTTTTCGTGTTCTCTTTTCTG                   206
YSCAT6 KO F   ATGTCTGCTCCCGCTGCCGATCATAACGCTGCCAAACCTACTGTGCGGTATTTCACACCG   207
YSCAT6 KO R   TCATTCTTTCTTTTCGTGTTCTCTTTTCTGTCTTACCAGCAGATTGTACTGAGAGTGCAC   208
YSCAT7F       GGATCCGCGGCCGCACAATGCTGCATCAAAAAATAGCTCATAAAGTTCG              209
YSCAT7R       GGATCCCCTGCAGGTCAAAAAATAAAACAATAAAGTTTATAAACTAACC              210
YSCAT7 KO F   ATGCTGCATCAAAAAATAGCTCATAAAGTTCGAAAAGTCGCTGTGCGGTATTTCACACCG   211
YSCAT7 KO R   TCAAAAAATAAAACAATAAAGTTTATAAACTAACCAAATTAGATTGTACTGAGAGTGCAC   212
YSCAT8F       GGATCCGCGGCCGCACAATGAGTGTGATAGGTAGGTTCTTG                      213
YSCAT8R       GGATCCCCTGCAGGTTAATGCATCTTTTTTACAGATGAACC                      214
YSCAT8 KO F   [first portion illegible in source]CTGTGCGGTATTTCACACCG        215
YSCAT8 KO R   TTAATGCATCTTTTTTACAGATGAACCTTCGTTATGGGTAAGATTGTACTGAGAGTGCAC   216


The entire coding regions for each of the acyltransferase sequences were amplified
using the respective primers listed in Table 3 above, cloned into the vector pCR2.1Topo
(Invitrogen) or pZero (Invitrogen), and labeled as pCGN8558 (ATAT1), pCGN8564
(ATAT2), pCGN8565 (ATAT3), pCGN8566 (ATAT4), pCGN8918 (ATAT6),
pCGN8913 (ATAT7), pCGN8904 (ATAT8), pCGN9970 (ATAT9), pCGN9940
(ATAT10), pCGN8567 (ATAT11), pCGN8632 (ATLPAAT1), pCGN9901 (YSCAT1,
also referred to as gi2132299), pCGN9902 (YSCAT2, also referred to as gi1078509),
pCGN9903 (YSCAT3, also referred to as gi2132939), pCGN9904 (YSCAT4, also
referred to as gi2133031), pCGN9905 (YSCAT5, also referred to as gi320748), pCGN9906
(YSCAT6, also referred to as gi549627), pCGN9907 (YSCAT7, also referred to as
gi586485), and pCGN9908 (YSCAT8, also referred to as gi464422). The nucleic acid
sequences for the respective yeast acyltransferases are provided as YSCAT1 (SEQ ID
NO:225), YSCAT2 (SEQ ID NO:226), YSCAT3 (SEQ ID NO:227), YSCAT4 (SEQ ID
NO:228), YSCAT5 (SEQ ID NO:229), YSCAT6 (SEQ ID NO:230), YSCAT7 (SEQ ID
NO:231), and YSCAT8 (SEQ ID NO:232).

7A. Baculovirus Expression Constructs

Constructs are prepared to direct the expression of the Arabidopsis ATAT sequences
in cultured insect cells. The entire coding regions of ATAT1, 2, 3, 4, 6, 7, 8, 9, 10, and 11 are
cloned into the vector pFastBac1 (Gibco-BRL, Gaithersburg, MD) digested with NotI and
PstI. The respective coding sequences were cloned as NotI/Sse8387I fragments. Double
stranded DNA sequence was obtained to verify that no errors were introduced by PCR
amplification. The resulting plasmids were designated pCGN9723 (ATAT1), pCGN9724
(ATAT2), pCGN9725 (ATAT3), pCGN9726 (ATAT4), pCGN9727 (ATAT5), pCGN9728
(ATAT7), pCGN9729 (ATAT8), pCGN9991 (ATAT9), pCGN9730 (ATAT10), and pCGN9731
(ATAT11).
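
The cloning sites carried by the amplification primers can be checked with a short script. The sketch below is illustrative: it searches the ATAT2F and ATAT2R primers of Table 3 for the recognition sequences of NotI (GCGGCCGC), Sse8387I (CCTGCAGG) and BamHI (GGATCC). The function name and the choice of primers are assumptions made for the example.

```python
# Recognition sequences of the enzymes relevant to the cloning primers.
SITES = {"NotI": "GCGGCCGC", "Sse8387I": "CCTGCAGG", "BamHI": "GGATCC"}


def sites_in(primer: str) -> dict:
    """Return {enzyme: 0-based position of first occurrence} for sites present in the primer."""
    seq = primer.replace(" ", "").upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


# Primers copied from Table 3 (SEQ ID NO:165 and SEQ ID NO:166).
ATAT2F = "GGATCCGCGGCCGCACAATGACGAGCTTTACTACTTCCCTTCAT"
ATAT2R = "GGATCCCCTGCAGGTTAGAGATCCATTGATTCTGCAAT"

print("ATAT2F", sites_in(ATAT2F))
print("ATAT2R", sites_in(ATAT2R))
```

In this pair the forward primer carries the NotI site and the reverse primer carries the Sse8387I site, so the amplified coding region can be excised as a NotI/Sse8387I fragment.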

7B. Plant Expression Construct Preparation 

A plasmid containing the napin cassette derived from pCGN3223 (described in USPN
5,639,790, the entirety of which is incorporated herein by reference) was modified to make it
more useful for cloning large DNA fragments containing multiple restriction sites, and to
allow the cloning of multiple napin fusion genes into plant binary transformation vectors. An
adapter comprised of the self annealed oligonucleotide of sequence
CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT
(SEQ ID NO:233) was ligated into the cloning vector pBC SK+ (Stratagene) after
digestion with the restriction endonuclease BssHII to construct vector pCGN7765. Plasmids
pCGN3223 and pCGN7765 were digested with NotI and ligated together. The resultant
vector, pCGN7770, contains the pCGN7765 backbone with the napin seed specific
expression cassette from pCGN3223.
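
Why a single oligonucleotide can serve as a self annealed adapter can be shown with a reverse complement check. The sketch below is illustrative: it assumes the adapter reads as the 54 nucleotides printed before the SEQ ID NO:233 label followed by the trailing AT, and tests whether everything after the first four nucleotides is its own reverse complement. The function names are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def self_anneal_overhang(oligo: str, overhang_len: int):
    """Return the 5' overhang if the rest of the oligo is palindromic, else None."""
    core = oligo[overhang_len:]
    if core == reverse_complement(core):
        return oligo[:overhang_len]
    return None


# Adapter of SEQ ID NO:233 as read from the text above (assumed reading).
ADAPTER = "CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT"

print(len(ADAPTER), self_anneal_overhang(ADAPTER, 4))
```

Two copies of the oligonucleotide can therefore pair along the palindromic portion, leaving a 5' CGCG overhang at each end that is compatible with BssHII-digested vector.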

The cloning cassette pCGN7787 contains essentially the same regulatory elements as
pCGN7770, except that the napin regulatory regions of pCGN7770 have been
replaced with the double CaMV 35S promoter and the tml polyadenylation and
transcriptional termination region.

A binary vector for plant transformation, pCGN5139, was constructed from
pCGN1558 (McBride and Summerfelt (1990) Plant Molecular Biology 14:269-276). The
polylinker of pCGN1558 was replaced as a HindIII/Asp718 fragment with a polylinker
containing unique restriction endonuclease sites, AscI, PacI, XbaI, SwaI, BamHI, and NotI.
The Asp718 and HindIII restriction endonuclease sites are retained in pCGN5139.

A series of turbo binary vectors are constructed to allow for the rapid cloning of DNA
sequences into binary vectors containing transcriptional initiation regions (promoters) and
transcriptional termination regions.

The plasmid pCGN8618 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:234) and 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:235) into SalI/XhoI-
digested pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was excised from pCGN8618 by digestion with Asp718I; the fragment was blunt-
ended by filling in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8622.
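
That the two polylinker oligonucleotides anneal to give a duplex with two 5' overhangs can be verified as sketched below. This is an illustrative check only; the function names are assumptions, and the sequences are SEQ ID NO:234 and SEQ ID NO:235 as given above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_overhangs(top: str, bottom: str, overhang_len: int):
    """Return both 5' overhangs if the remaining cores are fully complementary, else None."""
    core_top = top[overhang_len:]
    core_bottom = bottom[overhang_len:]
    if core_top == reverse_complement(core_bottom):
        return top[:overhang_len], bottom[:overhang_len]
    return None


OLIGO_234 = "TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG"
OLIGO_235 = "TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC"

print(duplex_overhangs(OLIGO_234, OLIGO_235, 4))
```

Both ends of the annealed duplex carry a 5' TCGA overhang, the overhang left by both SalI and XhoI, which is consistent with ligation into SalI/XhoI-digested pCGN7770.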

The plasmid pCGN8619 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:236) and 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:237) into SalI/XhoI-
digested pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was removed from pCGN8619 by digestion with Asp718I; the fragment was blunt-
ended by filling in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8623.

The plasmid pCGN8620 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO:238) and 5'-
CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:239) into SalI/SacI-
digested pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region
was removed from pCGN8620 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-
ended by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8624.

The plasmid pCGN8621 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO:240) and 5'-
GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:241) into SalI/SacI-
digested pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region
was removed from pCGN8621 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-
ended by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8625.

The coding regions of the various acyltransferase sequences were cloned as
NotI/Sse8387I fragments into pCGN8622, pCGN8623, pCGN8624, and pCGN8625, for
expression in sense or antisense orientations from a tissue preferential promoter, napin, or the
35S promoter. Fragments which were cloned into the pCGN8622 vector created the
constructs pCGN8901 (ATAT1), pCGN8571 (ATAT2), pCGN8909 (ATAT3), pCGN8596
(ATAT4), pCGN8919 (ATAT6), pCGN8914 (ATAT7), pCGN8905 (ATAT8), pCGN9973
(ATAT9), pCGN9942 (ATAT10), pCGN8575 (ATAT11), and pCGN8633 (ATLPAAT1) for
the sense expression of the respective coding sequences from the napin promoter. Fragments
which were cloned into the pCGN8623 vector created the constructs pCGN8900 (ATAT1),
pCGN8572 (ATAT2), pCGN8910 (ATAT3), pCGN8597 (ATAT4), pCGN8920 (ATAT6),
pCGN8915 (ATAT7), pCGN8906 (ATAT8), pCGN9972 (ATAT9), pCGN9943 (ATAT10),
pCGN8576 (ATAT11), and pCGN8634 (ATLPAAT1) for the antisense expression of the
respective coding sequences from the napin promoter. Fragments which were cloned into the
pCGN8624 vector created the constructs pCGN8903 (ATAT1), pCGN8573 (ATAT2),
pCGN8911 (ATAT3), pCGN8598 (ATAT4), pCGN8921 (ATAT6), pCGN8916 (ATAT7),
pCGN8907 (ATAT8), pCGN9971 (ATAT9), pCGN9944 (ATAT10), pCGN8577 (ATAT11),
and pCGN8635 (ATLPAAT1) for the sense expression of the respective coding sequences
from the 35S promoter. Fragments which were cloned into the pCGN8625 vector created the
constructs pCGN8902 (ATAT1) and pCGN9974 (ATAT9) for the antisense expression of
the respective coding sequences from the 35S promoter.

In addition, the yeast acyltransferase coding sequences were cloned into the vector
pCGN8624 creating the constructs pCGN9926 (YSCAT1), pCGN9927 (YSCAT2),
pCGN9928 (YSCAT3), pCGN9929 (YSCAT4), pCGN9930 (YSCAT5), pCGN9931
(YSCAT6), pCGN9932 (YSCAT7), and pCGN9933 (YSCAT8). These constructs allow for
the sense expression of the respective acyltransferase coding sequences from the 35S
promoter in plant cells.

Example 8: Plant Transformation 

A variety of methods have been developed to insert a DNA sequence of interest into the
genome of a plant host to obtain the transcription or transcription and translation of the sequence
to effect phenotypic changes.

Transgenic Brassica plants are obtained by Agrobacterium-mediated transformation
as described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694; Plant Cell Reports
(1992) 11:499-505). Transgenic Arabidopsis thaliana plants may be obtained by
Agrobacterium-mediated transformation as described by Valvekens et al. (Proc. Nat. Acad.
Sci. (1988) 85:5536-5540), or as described by Bent et al. ((1994) Science 265:1856-1860),
Bechtold et al. ((1993) C.R. Acad. Sci., Life Sciences 316:1194-1199) or Clough et al. ((1998)
Plant J. 16:735-43). Other plant species may be similarly transformed using related
techniques.

Alternatively, microprojectile bombardment methods, such as described by Klein et
al. (Bio/Technology 10:286-291), may also be used to obtain nuclear transformed plants.

The above results demonstrate that the nucleic acid sequences identified encode
proteins which are related to protein sequences encoding acyltransferase proteins. Such
acyltransferase sequences find use in preparing expression constructs for plant
transformations.

All publications and patent applications mentioned in this specification are indicative
of the level of skill of those skilled in the art to which this invention pertains. All
publications and patent applications are herein incorporated by reference to the same extent as
if each individual publication or patent application was specifically and individually indicated
to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be obvious that
certain changes and modifications may be practiced within the scope of the appended claims.

1. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:127
(VxNHxS) wherein the H is the conserved Histidine residue in the conserved peptide
sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid.

2. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:128
(VTYSxS) within about 30 amino acids downstream from the conserved amino acid sequence
HXXXXD of said acyltransferase-like protein, x representing any amino acid.

3. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:129
(VxLTRxR) within about 60 amino acids downstream from the conserved amino acid
sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid.

4. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:132
(LxxGDLV) within about 20 amino acids upstream of the conserved amino acid sequence
PEG of said acyltransferase-like protein, x representing any amino acid.

5. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:130 (CPEGT)
containing the conserved amino acid sequence PEG of said acyltransferase-like protein.

6. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:133
(FxxGAF) within about 20 amino acids downstream from the conserved amino acid sequence
PEG of said acyltransferase-like protein, x representing any amino acid.

7. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:131 (IVPVA)
within about 40 amino acids downstream from the conserved amino acid sequence PEG of
said acyltransferase-like protein.

8. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:134
(VANxxQ) within about 110 amino acids downstream from the conserved amino acid
sequence PEG of said acyltransferase-like protein, x representing any amino acid.

9. A DNA sequence encoding an enzyme of the class of acyltransferase-like proteins,
said DNA sequence obtainable by the steps comprising:

(a) using the profile of Figure 1 to search a nucleic acid sequence database;

(b) obtaining a probability score for nucleic acid sequences in said sequence
database using the Smith-Waterman algorithm; and

(c) selecting a nucleic acid sequence having a probability score of less than about 1.

10. The DNA encoding sequence according to Claim 9, wherein said DNA sequence
is an encoding sequence.

11. The DNA encoding sequence according to Claim 9, wherein said DNA sequence
is an EST.

12. The DNA encoding sequence according to any one of Claims 1 to 11, wherein
said acyltransferase-like protein is from a plant.

13. A construct comprising a DNA sequence of any one of Claims 1 to 11 linked to a
heterologous transcriptional and translational initiation region functional in a host cell.

14. The construct according to Claim 13 wherein said host cell is a plant cell. 

15. A plant cell comprising a DNA construct according to Claim 13. 

16. A plant comprising a cell according to Claim 15.

17. The DNA encoding sequence of any one of Claims 1 to 11 wherein said acyltransferase-
like protein is from Arabidopsis thaliana.

18. The DNA encoding sequence of any one of Claims 1 to 11 wherein said acyltransferase-
like protein is from corn.

19. The DNA encoding sequence of Claim 18 wherein said sequence comprises an
EST selected from the group consisting of SEQ ID NO:86 through SEQ ID NO:126.

20. The DNA encoding sequence of any one of Claims 1 to 11 wherein said acyltransferase-
like protein is from soybean.

21. The DNA encoding sequence of Claim 20 wherein said sequence comprises an
EST selected from the group consisting of SEQ ID NO:24 through SEQ ID NO:85.

22. The DNA encoding sequence of any one of Claims 2, 3, 4, 5, 7 and 8 wherein
said acyltransferase-like protein is selected from the group consisting of SEQ ID NO:1, SEQ
ID NO:10, SEQ ID NO:12, SEQ ID NO:14 and SEQ ID NO:16.

23. The DNA encoding sequence of either of Claim 1 and Claim 6 wherein said
acyltransferase-like protein is selected from the group consisting of SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7 and SEQ ID NO:18.
SEQUENCE LISTING

<110> Lassner, Michael W
      Emig, Robin A
      Ruezinsky, Diane
      Van Eenennaam, Alison

<120> Novel Plant Acyltransferases

<130> 17029/00/WO

<140>
<141>

<150> 60/101,939
<151> 1998-09-25

<160> 241

<170> PatentIn Ver. 2.0

<210> 1
<211> 869
<212> DNA
<213> Arabidopsis sp.



<400> 1
atggttcatg cgaccaagtc agccacaacg attccaaaag aacgcttaaa gaaccgcata  60
gtcttccatg atgggcgttt agcgcaacgt ccaactccgt taaacgccat tatcacatac 120
ctatggcttc cttttggttt catctctcca tcattcgcgt ctacttcaac ctccctttac 180
ctgaaagatt tgtccgttac acttacgaga tgctcgggat ccacttaacc attcgtggtc 240
atcgtcctcc acctccttcc cctggaactc ttggcaacct ccatgtcctt aaccaccgta 300
ccgcgcttga tcccatcanc gtcgctattg ctcctggacg taagatctgt tgcgtcactt 360
acagtgtctc tcgtctctcc ctcatgcttt ctcctattcc tgctgttgcc ctcacccgtg 420
accgtgccac cgatgctgcc aacatgagaa aacttctcga gaaaggcgac ttggtgatat 480
gtccggaagg cacgacgtgt agagaagagt atctactgag atttagcgct ctattcgcag 540
agctaagcga ccggattgtg ccagtagcga tgaactgtaa acaaggaatg ttcaacggga 600
ccacagttag gggtgtgaag ttttgggacc cttacttctt cttcatgaac ccaagaccaa 660
gctatgaagc cactttcttg gatcgtttgc ctgaagaaat gactgtcaac ggtggtggca 720
agactcctat agaggtggct aattacgtcc agaaagttat cggcgcggtt ttgggcttcg 780
aatgcaccga acttactcgc aaggataaat atcttttgct tggaggtaat gacggcaagg 840
tggagtctat caacaacacc aagaagtga                                   869



<210> 2 
<211> 289 
<212> PRT 

<213> Arabidopsis sp . 



<400> 2

Met Val His Ala Thr Lys Ser Ala Thr Thr Ile Pro Lys Glu Arg Leu
1 5 10 15

Lys Asn Arg Ile Val Phe His Asp Gly Arg Leu Ala Gln Arg Pro Thr
20 25 30

Pro Leu Asn Ala Ile Ile Thr Tyr Leu Trp Leu Pro Phe Gly Phe Ile
35 40 45

Leu Ser Ile Ile Arg Val Tyr Phe Asn Leu Pro Leu Pro Glu Arg Phe
50 55 60

Val Arg Tyr Thr Tyr Glu Met Leu Gly Ile His Leu Thr Ile Arg Gly
65 70 75 80

His Arg Pro Pro Pro Pro Ser Pro Gly Thr Leu Gly Asn Leu Tyr Val
85 90 95

Leu Asn His Arg Thr Ala Leu Asp Pro Ile Ile Val Ala Ile Ala Leu
100 105 110

Gly Arg Lys Ile Cys Cys Val Thr Tyr Ser Val Ser Arg Leu Ser Leu
115 120 125

Met Leu Ser Pro Ile Pro Ala Val Ala Leu Thr Arg Asp Arg Ala Thr
130 135 140

Asp Ala Ala Asn Met Arg Lys Leu Leu Glu Lys Gly Asp Leu Val Ile
145 150 155 160

Cys Pro Glu Gly Thr Thr Cys Arg Glu Glu Tyr Leu Leu Arg Phe Ser
165 170 175

Ala Leu Phe Ala Glu Leu Ser Asp Arg Ile Val Pro Val Ala Met Asn
180 185 190

Cys Lys Gln Gly Met Phe Asn Gly Thr Thr Val Arg Gly Val Lys Phe
195 200 205

Trp Asp Pro Tyr Phe Phe Phe Met Asn Pro Arg Pro Ser Tyr Glu Ala
210 215 220

Thr Phe Leu Asp Arg Leu Pro Glu Glu Met Thr Val Asn Gly Gly Gly
225 230 235 240

Lys Thr Pro Ile Glu Val Ala Asn Tyr Val Gln Lys Val Ile Gly Ala
245 250 255

Val Leu Gly Phe Glu Cys Thr Glu Leu Thr Arg Lys Asp Lys Tyr Leu
260 265 270

Leu Leu Gly Gly Asn Asp Gly Lys Val Glu Ser Ile Asn Asn Thr Lys
275 280 285


<210> 3 
<211> 939 
<212> DNA 

<213> Arabidopsis sp. 



<400> 3 
atgacgagct gcattcaatg ccttcatgct gtcccgagtg aaaaatttat gggcgaaaca 60
agacgtactg cacctagatc gtctaaccgc tctttaagac atgatcctta cagatttctt 120
gataagaaat caacccctga aagtcaattg gcaagagata tcactgtgag agcagatctt 180
tcaggagctg tattcttttg ctcttctttt cctgaaccag agattaagtt gagctcaaga 240
ctcagaggga ggcatccgtt tgttgttgct ggcatttcgg ctacttttct cattgtcctg 300
atgattattg aactttgggc cgtccttctc ttcgatccct ataggagaaa attccaccac 360
ttcattgcta atctgccatc ttccataagc atttatccgt tttacaaaat caacatcgag 420
ggtttggaga tctacacact atcagacact cctgctgtat atgtttcaaa ccaccaaagt 480
tttctggata taattcccat tcttagtctt ggaaaaagct ttaagttcat cagcaagaca 540
gggatattcg acccaagaag catcggttgg gccatgtcca tgatgggtgt cgttcccttg 600
aagcggatgg ctgtgttttt ccaagtggat tgcttaaaac gctgcatgga acttttaaag 660
aagggagcat aaggcgcatt cttcccagaa ggaacacgga gtaaggatgg tcggttaggt 720
tctttcaaga gaacaggcaa tacagtggct gcgaagaccg gagttgcagt agttccaata 780
acgctaatgg tta'catcca aatcatgcca acgggtagtg aaggtatact gaaccatggg 840
aatgtgagag taaaccaata catggaagca aagcggatgt tctttgcaac 900
gaggccagaa gcaagattgc agaatcaatg gatctctaa 939



<210> 4 
<211> 312
<212> PRT

<213> Arabidopsis sp . 



<400> 4 

Met Thr Ser Phe Thr Thr Ser Leu His Ala Val Ser Glu Lys Phe
1 5 10

Met Gly Glu Thr Arg Arg Thr Gly Ile Gln Trp Ser Asn Arg Ser Leu
20 25 30

WO 00/18889 PCT/US99/22231

Arg His Asp Pro Tyr Arg Phe Leu Asp Lys Lys Ser Pro Arg Ser Ser
35 40 45

Gln Leu Ala Arg Asp Ile Thr Val Arg Ala Asp Leu Ser Gly Ala Ala
50 55 60

Thr Pro Asp Ser Ser Phe Pro Glu Pro Glu Ile Lys Leu Ser Ser Arg
65 70 75 80

Leu Arg Gly Ile Phe Phe Cys Val Val Ala Gly Ile Ser Ala Thr Phe
85 90 95

Leu Ile Val Leu Met Ile Ile Gly His Pro Phe Val Leu Leu Phe Asp
100 105 110

Pro Tyr Arg Arg Lys Phe His His Phe Ile Ala Lys Leu Trp Ala Ser
115 120 125

Ile Ser Ile Tyr Pro Phe Tyr Lys Ile Asn Ile Glu Gly Leu Glu Asn
130 135 140

Leu Pro Ser Ser Asp Thr Pro Ala Val Tyr Val Ser Asn His Gln Ser
145 150 155 160

Phe Leu Asp Ile Tyr Thr Leu Leu Ser Leu Gly Lys Ser Phe Lys Phe
165 170 175

Ile Ser Lys Thr Gly Ile Phe Val Ile Pro Ile Ile Gly Trp Ala Met
180 185 190

Ser Met Met Gly Val Val Pro Leu Lys Arg Met Asp Pro Arg Ser Gln
195 200 205

Val Asp Cys Leu Lys Arg Cys Met Glu Leu Leu Lys Lys Gly Ala Ser
210 215 220

Val Phe Phe Phe Pro Glu Gly Thr Arg Ser Lys Asp Gly Arg Leu Gly
225 230 235 240

Ser Phe Lys Lys Gly Ala Phe Thr Val Ala Ala Lys Thr Gly Val Ala
245 250 255

Val Val Pro Ile Thr Leu Met Gly Thr Gly Lys Ile Met Pro Thr Gly
260 265 270

Ser Glu Gly Ile Leu Asn His Gly Asn Val Arg Val Ile Ile His Lys
275 280 285

Pro Ile His Gly Ser Lys Ala Asp Val Leu Cys Asn Glu Ala Arg Ser
290 295 300

Lys Ile Ala Glu Ser Met Asp Leu
305 310



<210> 5 
<211> 1197 
<212> DNA 

<213> Arabidopsis sp . 
<400> 5 

atggaatcag agctcaaaga tttgaattcg aattcgaatc ctccgtcgag caaagaggac 60
cggccgttac tgaaatcaga atccgatttg gcggctgcca ttgaagagtt agacaaaaag 120
ttcgcacctt acgcgaggac cgatttgtat gggacgatgg gtttgggtcc tttcccgatg 180
acggagaata ttaaattggc ggttgcattg gtgactcttg ttccattgcg gtttcttctc 240
tcgatgagca tcttgcttct ctattacttg atttgtaggg tatttacgct gttttctgct 300
ccttatcgtg ggccagagga agaggaagat gaaggtggag ttgtttttca ggaagattat 360
gctcacatgg aaggttggaa acggactgtt atcgtccggt ctgggaggtt tctctctagg 420
gt:ccgcr-!i tcg:rc:tgg gttttattgg attcacgaga gc:gtccaga :cgaga:tca 480
gaca:gga:t c:aa:cctaa aactac:tct acagagat:a accagaaagg ggaagccgcc 540
a:ggaggaac ctgaaagacc tggagccatit g;gtccaa:c a:g:tt:cg:a c:tggaca:*: 600
ttg:atcaca rgtctgcttc t:::ccaagt: t::tgt:gcca agagatcagt gggcaaacti 660
c:tc::g:tg gcczcatrag caaatgcc:; ggttgtgtct a:gtccaaag agaagcaaaa 720
tcgcctgatt tcaagggtgt atctggcaca gtaaatgaaa gagttcgaga agc:ca:agc 780
aataaatctg c:ccaac:a: tacgc:tcci :cagaaggaa caactaccaa cggagactac 840
ttacttacat tcaagacagg tg:at;^trtg gctggaactc cagttctt::c ggtaatatta 900
aaatatccgt acgagcgctL cagtgtggca :gggatacca tatccggggc acgccaca:: 960
t'.attccttc cctgtcaagt cgtaaa:cac :tggaagtca tacggtt:a:c tgtatactac 1020
ccatcccaag aagagaaaga cgatcccaaa ::tcatgcta gcaatgtt:g gaaattaatg 1080
gccaccgagg gcaacttgat tcraccggag "tgggactta gcgacaaaag gatatatcac 1140
gcaactctca atggnaatct tagtcaaacc zgtgatttcc atcagaaaga agaatga 1197

<210> 6
<211> 398
<212> PRT

<213> Arabidopsis sp.
<400> 6

Met Glu Ser Glu Leu Lys Asp Leu Asn Ser Asn Ser Asn Pro Pro Ser
1 5 10 15

Ser Lys Glu Asp Arg Pro Leu Leu Lys Ser Glu Ser Asp Leu Ala Ala
20 25 30

Ala Ile Glu Glu Leu Asp Lys Lys Phe Ala Pro Tyr Ala Arg Thr Asp
35 40 45

Leu Tyr Gly Thr Met Gly Leu Gly Pro Phe Pro Met Thr Glu Asn Ile
50 55 60

Lys Leu Ala Val Ala Leu Val Thr Leu Val Pro Leu Arg Phe Leu Leu
65 70 75 80

Ser Met Ser Ile Leu Leu Leu Tyr Tyr Leu Ile Cys Arg Val Phe Thr
85 90 95

Leu Phe Ser Ala Pro Tyr Arg Gly Pro Glu Glu Glu Glu Asp Glu Gly
100 105 110

Gly Val Val Phe Gln Glu Asp Tyr Ala His Met Glu Gly Trp Lys Arg
115 120 125

Thr Val Ile Val Arg Ser Gly Arg Phe Leu Ser Arg Val Leu Leu Phe
130 135 140

Val Phe Gly Phe Tyr Trp Ile His Glu Ser Cys Pro Asp Arg Asp Ser
145 150 155 160

Asp Met Asp Ser Asn Pro Lys Thr Thr Ser Thr Glu Ile Asn Gln Lys
165 170 175

Gly Glu Ala Ala Thr Glu Glu Pro Glu Arg Pro Gly Ala Ile Val Ser
180 185 190

Asn His Val Ser Tyr Leu Asp Ile Leu Tyr His Met Ser Ala Ser Phe
195 200 205

Pro Ser Phe Val Ala Lys Arg Ser Val Gly Lys Leu Pro Leu Val Gly
210 215 220

Leu Ile Ser Lys Cys Leu Gly Cys Val Tyr Val Gln Arg Glu Ala Lys
225 230 235 240

Ser Pro Asp Phe Lys Gly Val Ser Gly Thr Val Asn Glu Arg Val Arg
245 250 255

Glu Ala His Ser Asn Lys Ser Ala Pro Thr Ile Met Leu Phe Pro Glu
260 265 270

Gly Thr Thr Thr Asn Gly Asp Tyr Leu Leu Thr Phe Lys Thr Gly Ala
275 280 285

Phe Leu Ala Gly Thr Pro Val Leu Pro Val Ile Leu Lys Tyr Pro Tyr
290 295 300

Glu Arg Phe Ser Val Ala Trp Asp Thr Ile Ser Gly Ala Arg His Ile
305 310 315 320

Leu Phe Leu Leu Cys Gln Val Val Asn His Leu Glu Val Ile Arg Leu
325 330 335

Pro Val Tyr Tyr Pro Ser Gln Glu Glu Lys Asp Asp Pro Lys Leu Tyr
340 345 350

Ala Ser Asn Val Arg Lys Leu Met Ala Thr Glu Gly Asn Leu Ile Leu
355 360 365

Ser Glu Leu Gly Leu Ser Asp Lys Arg Ile Tyr His Ala Thr Leu Asn
370 375 380

Gly Asn Leu Ser Gln Thr Arg Asp Phe His Gln Lys Glu Glu
385 390 395



<210> 7 
<211> 1131 
<212> DNA 

<213> Arabidopsis sp . 



<400> 7
atgagcagta cggcagggag gctcgtgact tcaaaatccg agcttgacct cgatcaccct 60
aacatcgaag attaccttcc ttctggttct tccatcaatg aacctcgcgg caagctcagc 120
ctgcgtgatc tgctagacat ctctccaacg ctcactgaag ctgctggtgc cattgttgat 180
gactcgttca caagatgttt caaatcaaat cctccagaac cttggaactg gaatatttac 240
ttattcccac tatactgctt tggggttgtt gttagatact gtatcctctt tcccttgagg 300
tgcttcactt tagcttttgg gtggattatt ttcctttcat tgtttatccc tgtaaatgcg 360
tcgctgaaag gtcaagatag gttgaggaaa aagatagaga gggtcttggt ggaaatgatt 420
tgcagctttt ttgtcgcctc atggaccgga gttgtcaaat atcacgggcc acgtcctagc 480
atccgtccta agcaggtcta tgttgccaac catacttcaa tgattgattt catcgtattg 540
gagcagatga ccgcatttgc tgttataatg cagaagcatc ctggttgggt tggtcttctg 600
caaagcacaa tattagagag tgtgggatgt atctggttca atcgttcaga ggcaaaggat 660
cgtgaaattg tagcaaaaaa gttaagggac catgtccaag gagctgacag taatcctctt 720
ctcatatttc ccgaagggac atgtgtaaat aataattaca cagcgatgtt taagaagggt 780
gcttttgaat tggactgcac tgtttgtcca attgcaatta aatacaacaa gatttttgtt 840
gacgccttct ggaatagcag aaaacaatca tttactatgc acttgctgca actcatgaca 900
tcatgggctg ttgtatgtga agtgtggtac ttggaaccac aaaccataag gcccggtgaa 960
acaggaattg aatttgcaga gagggtcaga gacatgatat ctcttcgggc gggtctcaaa 1020
aaggtccctt tagtgaacgc 1080
aagcaacaga gtttcgcaga gtcgatcctg gctagattgg aagagaagtg a 1131



<210> 8 
<211> 376 
<212> PRT 

<213> Arabidopsis sp . 
<400> 8 

Met Ser Ser Thr Ala Gly Arg Leu Val Thr Ser Lys Ser Glu Leu Asp
1 5 10 15

Leu Asp His Pro Asn Ile Glu Asp Tyr Leu Pro Ser Gly Ser Ser Ile
20 25 30

Asn Glu Pro Arg Gly Lys Leu Ser Leu Arg Asp Leu Leu Asp Ile Ser
35 40 45

Pro Thr Leu Thr Glu Ala Ala Gly Ala Ile Val Asp Asp Ser Phe Thr
50 55 60

Arg Cys Phe Lys Ser Asn Pro Pro Glu Pro Trp Asn Trp Asn Ile Tyr
65 70 75 80

Leu Phe Pro Leu Tyr Cys Phe Gly Val Val Val Arg Tyr Cys Ile Leu
85 90 95

Phe Pro Leu Arg Cys Phe Thr Leu Ala Phe Gly Trp Ile Ile Phe Leu
100 105 110

Ser Leu Phe Ile Pro Val Asn Ala Leu Leu Lys Gly Gln Asp Arg Leu
115 120 125

Arg Lys Lys Ile Glu Arg Val Leu Val Glu Met Ile Cys Ser Phe Phe
130 135 140

Val Ala Ser Trp Thr Gly Val Val Lys Tyr His Gly Pro Arg Pro Ser
145 150 155 160

Ile Arg Pro Lys Gln Val Tyr Val Ala Asn His Thr Ser Met Ile Asp
165 170 175

Phe Ile Val Leu Glu Gln Met Thr Ala Phe Ala Val Ile Met Gln Lys
180 185 190

His Pro Gly Trp Val Gly Leu Leu Gln Ser Thr Ile Leu Glu Ser Val
195 200 205

Gly Cys Ile Trp Phe Asn Arg Ser Glu Ala Lys Asp Arg Glu Ile Val
210 215 220

Ala Lys Lys Leu Arg Asp His Val Gln Gly Ala Asp Ser Asn Pro Leu
225 230 235 240

Leu Ile Phe Pro Glu Gly Thr Cys Val Asn Asn Asn Tyr Thr Val Met
245 250 255

Phe Lys Lys Gly Ala Phe Glu Leu Asp Cys Thr Val Cys Pro Ile Ala
260 265 270

Ile Lys Tyr Asn Lys Ile Phe Val Asp Ala Phe Trp Asn Ser Arg Lys
275 280 285

Gln Ser Phe Thr Met His Leu Leu Gln Leu Met Thr Ser Trp Ala Val
290 295 300

Val Cys Glu Val Trp Tyr Leu Glu Pro Gln Thr Ile Arg Pro Gly Glu
305 310 315 320

Thr Gly Ile Glu Phe Ala Glu Arg Val Arg Asp Met Ile Ser Leu Arg
325 330 335

Ala Gly Leu Lys Lys Val Pro Trp Asp Gly Tyr Leu Lys Tyr Ser Arg
340 345 350

Pro Ser Pro Lys His Ser Glu Arg Lys Gln Gln Ser Phe Ala Glu Ser
355 360 365

Ile Leu Ala Arg Leu Glu Glu Lys
370 375

<210> 9 
<211> 965 
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<212> DNA 

<213> Arabidopsis sp . 



<400> 9 

gttgttaagt tacaagtctc ttcaaaaaca cacacacacg tctctcttca cagccaatca 60
tcgatcacag ctcgattttc ctttactgtt ccgttggttt tcttgagnat ttttctttct 120
tgggatcatc aaactngtcg gtaaggwaac ttcacggacg gatcttcaat gttgagctgt 180
tctaatggta ccgtcgtgat cgcaaccgcc atggtttgct caagcaccgc tctgtttctc 240
gccatggctc gtcaattcca tggaaatcat caaaatccta aggttcttga tcagactcta 300
cgacccattc tccgttcttg tctatcttca gaggaaacga agaaacaggg gaagaagata 360
aagaaagtgc ggttcgcgga taatgtgaaa gatacgaaag gtaacgggga agagtaccgg 420
aggagggaat tgaaccggaa aagcgtaccg aagccagtga ctaaaccggg aaagaccggt 480
tctatgtgta gaatctctac catgccagcg aaccggatgg ctctgtacaa tgggattctt 540
agagaccgag atcacagagt tcaatattct tattgacttt ttcttcttga ttagtcaata 600
gatttaggtt ttgtaaatct tccttttgct tttcggtaat attagatttt ttcttggaaa 660
tntcagatat tgtagacttt gtagttgggt ggtcttcttt ttctcccttt ttgtgtctca 720
tagtagtagg tggttttctt atgctccact tatctactta cttgttttaa atcaagtgat 780
gatgtaaata attgacatgt aagtagtcat tagaaatttg aaaaggcaaa tgaaagaata 840
taaatttgta aaaacatagt gtgcctattg tacatataaa ctctcttttg ttggggatat 900
ctatggaatt tatattgatt gtgttgaaaa aacaaaaaaa aaaaaaaaaa aaaaaaaaaa 960
aaaaa 965



<210> 10 
<211> 1593 
<212> DNA 

<213> Arabidopsis sp . 



<400> 10 

atgtccggta ataagatctc gactcttcaa gctcttgtct tcttcttgta ccggtttttc 60
attctccgtc gttggtgtca tcgtagccct aaacaaaaat accaaaaatg cccttctcac 120
ggcctccacc aatatcaaga cctatcgaat cacactttga tattcaacgt cgaaggagct 180
ctactcaaat caaactcttt attcccttac ttcatggttg tggcattcga agccggaggg 240
gtgataaggt cacttttcct cttagttctt tatccattta taagcttgat gagctacgaa 300
atgggcttga agacgatggt gatgctgagc ttctttggag ttaaaaagga aagcttccga 360
gtggggaaat cagttttgcc taagtatttt ctagaagatg ttgggctcga gatgttccag 420
gctttgaaaa gaggaggcaa gagagttgct gtgagtgatt taccacaagt tatgattgat 480
gtattcttgc gagattactt ggagatagaa gttgtggtcg gaagagacat gaaaatggtc 540
ggtggttact acctaggcat cgtggaggat aagaagaacc ttgaaattgc ttttgataaa 600
gtggttcaag aagaaagact tggtagtggt cgtcgtctta ttggcatcac ttcctttaac 660
tcgccaagtc acagatctct cttctctcaa ttttgccagg aaatttactt cgtcagaaat 720
tcagacaaga aaagttggca aaccctacca caagatcaat accctaaacc attgattttc 780
cacgatggtc gtttagccgt taagccaaca cctttaaaca cactcgtatt attcatgtgg 840
gccccattcg ccgccgtctt agccgctgca agactcgtct tcggcctaaa cttaccttac 900
tccctagcca atcccttcct cgccttttcc ggtatccacc ttactctcac cgtcaacaac 960
cacaacgacc taatatccgc cgacagaaaa agaggttgtc tctttgtgtg taaccataga 1020
acgttattgg acccacttta catttcatac gctctaagaa agaaaaacat gaaagccgtg 1080
acgtatagtc taagcagatt atctgagctt ctggctccga tcaagaccgt tagattgact 1140
cgtgatcgag tcaaagatgg tcaagccatg gagaaattgc tgagccaggg agatctcgtg 1200
gtttgtccgg aagggactac gtgtagagag ccttacttgc ttcggtttag tccacttttc 1260
tctgaggttt gtgacgtcat cgtacctgtt gctattgact cacacgtgac tttcttctat 1320
ggcacgacgg ctagtggtct taaggcattt gatcccattt tcttcctttt gaatcctttc 1380
ccttcctaca ccgtcaaatt gcttgaccct gtctctggaa gtagctcgtc cacgtgtcga 1440
ggagtccctg acaatggaaa agttaacttc gaggtggcta atcacgtgca gcatgagatc 1500
gggaatgcct tggggtttga gtgcaccaac ctcacgagaa gagataagta cttgatcttg 1560
gccggtaata acggagttgt caagaaaaaa taa 1593

<210> 11
<211> 530
<212> PRT

<400> 11

Met Ser Gly Asn Lys Ile Ser Thr Leu Gln Ala Leu Val Phe Phe Leu
1 5 10 15

Tyr Arg Phe Phe Ile Leu Arg Arg Trp Cys His Arg Ser Pro Lys Gln
20 25 30

Lys Tyr Gln Lys Cys Pro Ser His Gly Leu His Gln Tyr Gln Asp Leu
35 40 45

Ser Asn His Thr Leu Ile Phe Asn Val Glu Gly Ala Leu Leu Lys Ser
50 55 60

Asn Ser Leu Phe Pro Tyr Phe Met Val Val Ala Phe Glu Ala Gly Gly
65 70 75 80

Val Ile Arg Ser Leu Phe Leu Leu Val Leu Tyr Pro Phe Ile Ser Leu
85 90 95

Met Ser Tyr Glu Met Gly Leu Lys Thr Met Val Met Leu Ser Phe Phe
100 105 110

Gly Val Lys Lys Glu Ser Phe Arg Val Gly Lys Ser Val Leu Pro Lys
115 120 125

Tyr Phe Leu Glu Asp Val Gly Leu Glu Met Phe Gln Val Leu Lys Arg
130 135 140

Gly Gly Lys Arg Val Ala Val Ser Asp Leu Pro Gln Val Met Ile Asp
145 150 155 160

Val Phe Leu Arg Asp Tyr Leu Glu Ile Glu Val Val Val Gly Arg Asp
165 170 175

Met Lys Met Val Gly Gly Tyr Tyr Leu Gly Ile Val Glu Asp Lys Lys
180 185 190

Asn Leu Glu Ile Ala Phe Asp Lys Val Val Gln Glu Glu Arg Leu Gly
195 200 205

Ser Gly Arg Arg Leu Ile Gly Ile Thr Ser Phe Asn Ser Pro Ser His
210 215 220

Arg Ser Leu Phe Ser Gln Phe Cys Gln Glu Ile Tyr Phe Val Arg Asn
225 230 235 240

Ser Asp Lys Lys Ser Trp Gln Thr Leu Pro Gln Asp Gln Tyr Pro Lys
245 250 255

Pro Leu Ile Phe His Asp Gly Arg Leu Ala Val Lys Pro Thr Pro Leu
260 265 270

Asn Thr Leu Val Leu Phe Met Trp Ala Pro Phe Ala Ala Val Leu Ala
275 280 285

Ala Ala Arg Leu Val Phe Gly Leu Asn Leu Pro Tyr Ser Leu Ala Asn
290 295 300

Pro Phe Leu Ala Phe Ser Gly Ile His Leu Thr Leu Thr Val Asn Asn
305 310 315 320

His Asn Asp Leu Ile Ser Ala Asp Arg Lys Arg Gly Cys Leu Phe Val
325 330 335

Cys Asn His Arg Thr Leu Leu Asp Pro Leu Tyr Ile Ser Tyr Ala Leu
340 345 350

Arg Lys Lys Asn Met Lys Ala Val Thr Tyr Ser Leu Ser Arg Leu Ser
355 360 365

Glu Leu Leu Ala Pro Ile Lys Thr Val Arg Leu Thr Arg Asp Arg Val
370 375 380

Lys Asp Gly Gln Ala Met Glu Lys Leu Leu Ser Gln Gly Asp Leu Val
385 390 395 400

Val Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe
405 410 415

Ser Pro Leu Phe Ser Glu Val Cys Asp Val Ile Val Pro Val Ala Ile
420 425 430

Asp Ser His Val Thr Phe Phe Tyr Gly Thr Thr Ala Ser Gly Leu Lys
435 440 445

Ala Phe Asp Pro Ile Phe Phe Leu Leu Asn Pro Phe Pro Ser Tyr Thr
450 455 460

Val Lys Leu Leu Asp Pro Val Ser Gly Ser Ser Ser Ser Thr Cys Arg
465 470 475 480

Gly Val Pro Asp Asn Gly Lys Val Asn Phe Glu Val Ala Asn His Val
485 490 495

Gln His Glu Ile Gly Asn Ala Leu Gly Phe Glu Cys Thr Asn Leu Thr
500 505 510

Arg Arg Asp Lys Tyr Leu Ile Leu Ala Gly Asn Asn Gly Val Val Lys
515 520 525

Lys Lys
530



<210> 12 
<211> 1509 
<212> DNA 

<213> Arabidopsis sp . 



<400> 12 

atggttatgg agcaagctgg aacgacatcg tattcggtcg tgtcagagtt tgaaggaaca 60
atactgaaga acgcagattc attctcttac ttcatgctcg tagccttcga agcagctggt 120
ctaactcgtt tcgctatctt gttgtttcta tggcccgtaa tcacactcct tgacgttttc 180
agctacaaaa acgcagctct caagctcaag atttttgtag ccactgttgg tctacgtgaa 240
ccggagatcg aatcagtggc tagagccgtt ctgccaaaat tctacatgga cgacgtaagc 300
atggacacgt ggagggtttt cagctcgtgt aagaagaggg tcgtggtcac gagaatgcct 360
cgagttatgg tggagaggtt tgctaaggag catcttagag cagatgaggt catcggtacg 420
gaactgattg taaaccggtt cggttttgtc accggtttga ttcgcgaaac ggatgttgat 480
cagtctgctt tgaaccgtgt cgctaatttg tttgttggtc ggaggcctca actaggtctt 540
ggaaaaccgg ctttgaccgc ctctacaaat ttcttatcgt tatgtgagga gcatattcat 600
gcaccaatcc cggagaacta caaccacggt gaccaacaac ttcagctacg tccacttccg 660
gtgatatttc acgacggaag actagtgaag cggccaacgc cggccaccgc tctcatcatc 720
ctcctttgga tcccatttgg aatcattctc gccgtgatcc ggatctttct tggagccgtc 780
ctcccattgt gggccacacc ttacgtctct cagatattcg gtggccatat catcgtcaaa 840
ggaaagcctc ctcagccacc ggcggctgga aaatccggcg tgctctttgt gtgtactcac 900
agaaccctaa tggaccctgt ggtattatct tatgtcctcg gacgtagcat cccagccgtt 960
acttactcaa tctcgcgctt atcagagatc ttatctccca ttccaaccgt ccgattgaca 1020
agaatccgag atgtggatgc ggctaagatc aaacaacaac tgtcaaaagg agatctagtg 1080
gtttgtcctg agggaaccac ttgtcgtgaa ccgtttttgt taagattcag cgcgcttttc 1140
gctgagttaa cggataggat tgttccggtt gcgatgaact acagagtcgg attcttccac 1200
gcgactacag cgagaggctg gaagggtttg gacccaattt tcttcttcat gaacccaaga 1260
ccggtttacg agattacgtt cttgaaccag cttcctatgg aggcaacatg ttcgtccggg 1320
aagagcccgc atgacgtggc gaac:acat: cagagaaict tggcggctac grtagggttt 1380
gagtgcacca ac:tcacaag aaaaga:aag tataggg^tc tcgctggaaa cgatggaaca 1440
g-.gicgtact tgtcgttgc: agaccaa::g aagaaggtgg t:agca::tt cgagccttgt 1500
ctccattga 1509

<210> 13
<211> 502
<212> PRT

<213> Arabidopsis sp.
<400> 13

Met Val Met Glu Gln Ala Gly Thr Thr Ser Tyr Ser Val Val Ser Glu
1 5 10 15

Phe Glu Gly Thr Ile Leu Lys Asn Ala Asp Ser Phe Ser Tyr Phe Met
20 25 30

Leu Val Ala Phe Glu Ala Ala Gly Leu Ile Arg Phe Ala Ile Leu Leu
35 40 45

Phe Leu Trp Pro Val Ile Thr Leu Leu Asp Val Phe Ser Tyr Lys Asn
50 55 60

Ala Ala Leu Lys Leu Lys Ile Phe Val Ala Thr Val Gly Leu Arg Glu
65 70 75 80

Pro Glu Ile Glu Ser Val Ala Arg Ala Val Leu Pro Lys Phe Tyr Met
85 90 95

Asp Asp Val Ser Met Asp Thr Trp Arg Val Phe Ser Ser Cys Lys Lys
100 105 110

Arg Val Val Val Thr Arg Met Pro Arg Val Met Val Glu Arg Phe Ala
115 120 125

Lys Glu His Leu Arg Ala Asp Glu Val Ile Gly Thr Glu Leu Ile Val
130 135 140

Asn Arg Phe Gly Phe Val Thr Gly Leu Ile Arg Glu Thr Asp Val Asp
145 150 155 160

Gln Ser Ala Leu Asn Arg Val Ala Asn Leu Phe Val Gly Arg Arg Pro
165 170 175

Gln Leu Gly Leu Gly Lys Pro Ala Leu Thr Ala Ser Thr Asn Phe Leu
180 185 190

Ser Leu Cys Glu Glu His Ile His Ala Pro Ile Pro Glu Asn Tyr Asn
195 200 205

His Gly Asp Gln Gln Leu Gln Leu Arg Pro Leu Pro Val Ile Phe His
210 215 220

Asp Gly Arg Leu Val Lys Arg Pro Thr Pro Ala Thr Ala Leu Ile Ile
225 230 235 240

Leu Leu Trp Ile Pro Phe Gly Ile Ile Leu Ala Val Ile Arg Ile Phe
245 250 255

Leu Gly Ala Val Leu Pro Leu Trp Ala Thr Pro Tyr Val Ser Gln Ile
260 265 270

Phe Gly Gly His Ile Ile Val Lys Gly Lys Pro Pro Gln Pro Pro Ala
275 280 285

Ala Gly Lys Ser Gly Val Leu Phe Val Cys Thr His Arg Thr Leu Met
290 295 300

Asp Pro Val Val Leu Ser Tyr Val Leu Gly Arg Ser Ile Pro Ala Val
305 310 315 320

Thr Tyr Ser Ile Ser Arg Leu Ser Glu Ile Leu Ser Pro Ile Pro Thr
325 330 335

Val Arg Leu Thr Arg Ile Arg Asp Val Asp Ala Ala Lys Ile Lys Gln
340 345 350

Gln Leu Ser Lys Gly Asp Leu Val Val Cys Pro Glu Gly Thr Thr Cys
355 360 365

Arg Glu Pro Phe Leu Leu Arg Phe Ser Ala Leu Phe Ala Glu Leu Thr
370 375 380

Asp Arg Ile Val Pro Val Ala Met Asn Tyr Arg Val Gly Phe Phe His
385 390 395 400

Ala Thr Thr Ala Arg Gly Trp Lys Gly Leu Asp Pro Ile Phe Phe Phe
405 410 415

Met Asn Pro Arg Pro Val Tyr Glu Ile Thr Phe Leu Asn Gln Leu Pro
420 425 430

Met Glu Ala Thr Cys Ser Ser Gly Lys Ser Pro His Asp Val Ala Asn
435 440 445

Tyr Val Gln Arg Ile Leu Ala Ala Thr Leu Gly Phe Glu Cys Thr Asn
450 455 460

Phe Thr Arg Lys Asp Lys Tyr Arg Val Leu Ala Gly Asn Asp Gly Thr
465 470 475 480

Val Ser Tyr Leu Ser Leu Leu Asp Gln Leu Lys Lys Val Val Ser Thr
485 490 495

Phe Glu Pro Cys Leu His
500



<210> 14 
<211> 1563 
<212> DNA 

<213> Arabidopsis sp . 



<400> 14 

atgtccgcca agatttcaat attccaagct cttgtctttc tattctaccg gtttatcctc 60
cggcgatatc ggaactctaa accaaaatac caaaatggcc cttcttctct cctccaatcc 120
gacctatcac gccacacatt gatcttcaac gtagaaggag ctcttctcaa atccgactct 180
ctcttccctt acttcatgtt agtagcattt gaggcgggag gcgtaataag gtcatttctc 240
ctcttcattc tctatccatt gataagcttg atgagccatg agatgggtgt caaagtgatg 300
gtaatggtga gcttcttcgg gatcaaaaaa gaaggttttc gagcggggag agcggttttg 360
cctaaatact ttctagaaga tgtcggactc gagatcttcg aagtgttgaa gagaggaggg 420
aagaaaatcg gagtgagtga tgatcttcct caagttatga tcgaagggtt cttgagagat 480
tacttggaga ttgacgttgt ggtcgggaga gaaatgaaag tcgttggagg ttattatcta 540
ggtatcatgg aggataaaac caaacatgat cttgtctttg atgagttagt tcgtaaagag 600
agactaaaca ccggtcgtgt tattggcatc acttccttca atacatctct tcaccgatat 660
ctattctctc agttttgcca ggaaatttat ttcgtgaaga aatcagacaa gcgaagctgg 720
caaaccctac cacgaagcca gtaccctaaa ccattgattt tccatgatgg ccgtctcgcg 780
atcaaaccaa ccctaatgaa cactttggtc ttgttcatgt ggggtccttt cgcagccgca 840
gccgcagcag ccagactctt cgtctctctt tgcatccctt actctttatc aatcccgatc 900
ctcgcctttt ccggttgcag actaaccgtc actaacgact acgtttcatc tcaaaaacaa 960
aaaccaagtc aacgcaaagg ttgtctcttt gtatgtaacc ataggacttt attggaccct 1020
ctctatgttg cattcgcttt gagaaagaaa aacatcaaaa ctgtaacgta tagtttgagt 1080
agggtatctg agattttggc tccgatcaag acggtgagac tgacccgtga tcgggtgagc 1140
gacggtcaag ccatggagaa attg:;aacc gaaggagatc tcgtrgtttg tcc'gaagga 1200
accacttcta gagaaccc^a cct.gc:tagg tttagccctc :gt:caccga ggttagtgat 1260
g*:cat:cgttc ccg^ggctgt gacgg:a:ac g:gaccttct :ctacggtac aacggcgagt 1320
gg:cttaagg cac::gaccc gc;:ttc::c c:c:"gga:c c:tarcc:a: ccaca:catc 1380
caat:tc:cg accctgtc:c cggtgccacg :gccaagatc ctgatggaaa gttgaagtt: 1440
gaggtggcca acaatgttca gag:gatatt gggaaggcgc tggatttcga gtgca:aagt 1500
ctcac:agaa aagacaagia titgatctcg gccggtaata atggagtagt taagaaaaat 1560
taa 1563

<210> 15
<211> 520
<212> PRT

<213> Arabidopsis sp.
<400> 15

Met Ser Ala Lys Ile Ser Ile Phe Gln Ala Leu Val Phe Leu Phe Tyr
1 5 10 15

Arg Phe Ile Leu Arg Arg Tyr Arg Asn Ser Lys Pro Lys Tyr Gln Asn
20 25 30

Gly Pro Ser Ser Leu Leu Gln Ser Asp Leu Ser Arg His Thr Leu Ile
35 40 45

Phe Asn Val Glu Gly Ala Leu Leu Lys Ser Asp Ser Leu Phe Pro Tyr
50 55 60

Phe Met Leu Val Ala Phe Glu Ala Gly Gly Val Ile Arg Ser Phe Leu
65 70 75 80

Leu Phe Ile Leu Tyr Pro Leu Ile Ser Leu Met Ser His Glu Met Gly
85 90 95

Val Lys Val Met Val Met Val Ser Phe Phe Gly Ile Lys Lys Glu Gly
100 105 110

Phe Arg Ala Gly Arg Ala Val Leu Pro Lys Tyr Phe Leu Glu Asp Val
115 120 125

Gly Leu Glu Ile Phe Glu Val Leu Lys Arg Gly Gly Lys Lys Ile Gly
130 135 140

Val Ser Asp Asp Leu Pro Gln Val Met Ile Glu Gly Phe Leu Arg Asp
145 150 155 160

Tyr Leu Glu Ile Asp Val Val Val Gly Arg Glu Met Lys Val Val Gly
165 170 175

Gly Tyr Tyr Leu Gly Ile Met Glu Asp Lys Thr Lys His Asp Leu Val
180 185 190

Phe Asp Glu Leu Val Arg Lys Glu Arg Leu Asn Thr Gly Arg Val Ile
195 200 205

Gly Ile Thr Ser Phe Asn Thr Ser Leu His Arg Tyr Leu Phe Ser Gln
210 215 220

Phe Cys Gln Glu Ile Tyr Phe Val Lys Lys Ser Asp Lys Arg Ser Trp
225 230 235 240

Gln Thr Leu Pro Arg Ser Gln Tyr Pro Lys Pro Leu Ile Phe His Asp
245 250 255

Gly Arg Leu Ala Ile Lys Pro Thr Leu Met Asn Thr Leu Val Leu Phe
260 265 270

Met Trp Gly Pro Phe Ala Ala Ala Ala Ala Ala Ala Arg Leu Phe Val
275 280 285

Ser Leu Cys Ile Pro Tyr Ser Leu Ser Ile Pro Ile Leu Ala Phe Ser
290 295 300

Gly Cys Arg Leu Thr Val Thr Asn Asp Tyr Val Ser Ser Gln Lys Gln
305 310 315 320

Lys Pro Ser Gln Arg Lys Gly Cys Leu Phe Val Cys Asn His Arg Thr
325 330 335

Leu Leu Asp Pro Leu Tyr Val Ala Phe Ala Leu Arg Lys Lys Asn Ile
340 345 350

Lys Thr Val Thr Tyr Ser Leu Ser Arg Val Ser Glu Ile Leu Ala Pro
355 360 365

Ile Lys Thr Val Arg Leu Thr Arg Asp Arg Val Ser Asp Gly Gln Ala
370 375 380

Met Glu Lys Leu Leu Thr Glu Gly Asp Leu Val Val Cys Pro Glu Gly
385 390 395 400

Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe Ser Pro Leu Phe Thr
405 410 415

Glu Val Ser Asp Val Ile Val Pro Val Ala Val Thr Val His Val Thr
420 425 430

Phe Phe Tyr Gly Thr Thr Ala Ser Gly Leu Lys Ala Leu Asp Pro Leu
435 440 445

Phe Phe Leu Leu Asp Pro Tyr Pro Thr Tyr Thr Ile Gln Phe Leu Asp
450 455 460

Pro Val Ser Gly Ala Thr Cys Gln Asp Pro Asp Gly Lys Leu Lys Phe
465 470 475 480

Glu Val Ala Asn Asn Val Gln Ser Asp Ile Gly Lys Ala Leu Asp Phe
485 490 495

Glu Cys Thr Ser Leu Thr Arg Lys Asp Lys Tyr Leu Ile Leu Ala Gly
500 505 510

Asn Asn Gly Val Val Lys Lys Asn
515 520



<210> 16 
<211> 1506 
<212> DNA 

<213> Arabidopsis sp . 
<400> 16 

atgggagctc aggagaaacg gcgccgtttc gagcagatat caaagtgcga tgttaaggac 60 

cggtccaacc ataccgtggc cgctgatcta gacggaacac tactaatctc tcgtagcgcc 120 

ttcccttact atttcctcgt agccctcgag gcagggagct tgctccgagc gttgatccta 180 

cttgtgtccg taccattcgt ttatcttacg tacttgacca tctccgagac tttagccatc 240 

aacgtatttg tcttcatcac gttcgcgggt ctcaagatcc gagacgttga gctagtggtc 300 

cgttccgtcc tcccgaggtt ctatgcggag gacgtgaggc ccgatacctg gcgtatcttc 360 

aacacgttcg ggaaacggta cataataact gcgagccctc gaattatggt cgagccattc 420 

gtgaaaacat tcctaggagt tgataaagtt cttggaacag agctagaggt ctccaaatcg 480 

ggtcgggcaa ccgggttcac cagaaaacca ggtattctcg tcggtcagta caaacgtgac 540 

gtcgttttga gagagtttgg tggcctagcg tctgatttac ctgatttggg gctcggcgat 600 

agcaagacgg accacgactt catgtccatc tgcaaggaag gttacatggt gccacgtacg 660
aaatgcgaac ca::ac:aag aaacaaac:c ::aaccccca caatattcca cgagggcaga 720
t:ag:ccaac gcccaa:gcc gt:agttgc: c:gtcaactt tcctctcgct tcccgtcgg: 780
:tcg:cc:ct ctatcatccg cgtctacacg aa:atcccgt taccggaacg ta:cgcccg: 840
nacaactaca age t :a::gg ca:caagc:a g~cgtcaacg gccacccccc tccgccgcca 900
aaa:ctggcc agccaggcca tctrttggtc tgcaaccacc gcaccgttct cgarcctgtg 960
gtcacagctg :cgcac:cgg ceggaaaate agccgcgtca cttacagca: cagcaagttc 1020
ccigagctaa tc:caccaa: caaagecgnt gcgttgac:c gccaacgtga gaaagacgea 1080
geqaacatea agcg:c:::: ggaggaaggc gacctcgtga tatgtcccga gggaaccacg 1140
tgccgtgagc cttticcttic: ceggcttag- gctcttttcg ctgagctcac ggaceggate 1200
gticccgtgg cgatcaacac aaagcagagc atgttcaatg g:accaccac acgtggatac 1260
a age tr.cr,rg a:cc:tact: tgcgttcatg aacccgaggc cgacgta:ga gatcaegtte 1320
c:caaacaga ctccagctga gctgacgtgt aaaggaggca aatctccgat agaggttgcg 1380
aat.-acatac agagggtttt gggaggaacc ttaggttttg agtgcaccaa tttcacaaga 1440
aacgataagt aegcaatget ngctggtiac t gaeggtaggg ttccggtgaa gaaggagaag 1500
acctga 1506

<210> 17
<211> 500
<212> PRT

<213> Arabidopsis sp.
<400> 17

Met Gly Ala Gin Glu Lys Arg Arg Arg Phe Glu Gin lie Ser Lys Cys 
15 10 15 

Asn Val Lys Asp Arg Ser Asn His Thr Val Ala Ala Asp Leu Asp Gly 
20 25 30 

Thr Leu Leu Ile Ser Arg Ser Ala Phe Pro Tyr Tyr Phe Leu Val Ala 
35 40 45 

Leu Glu Ala Gly Ser Leu Leu Arg Ala Leu Ile Leu Leu Val Ser Val 
50 55 60 

Pro Phe Val Tyr Leu Thr Tyr Leu Thr Ile Ser Glu Thr Leu Ala Ile 
65 70 75 80 

Asn Val Phe Val Phe Ile Thr Phe Ala Gly Leu Lys Ile Arg Asp Val 
85 90 95 

Glu Leu Val Val Arg Ser Val Leu Pro Arg Phe Tyr Ala Glu Asp Val 
100 105 110 

Arg Pro Asp Thr Trp Arg Ile Phe Asn Thr Phe Gly Lys Arg Tyr Ile 
115 120 125 

Ile Thr Ala Ser Pro Arg Ile Met Val Glu Pro Phe Val Lys Thr Phe 
130 135 140 

Leu Gly Val Asp Lys Val Leu Gly Thr Glu Leu Glu Val Ser Lys Ser 
145 150 155 160 

Gly Arg Ala Thr Gly Phe Thr Arg Lys Pro Gly Ile Leu Val Gly Gln 
165 170 175 

Tyr Lys Arg Asp Val Val Leu Arg Glu Phe Gly Gly Leu Ala Ser Asp 
180 185 190 

Leu Pro Asp Leu Gly Leu Gly Asp Ser Lys Thr Asp His Asp Phe Met 
195 200 205 






Ser Ile Cys Lys Glu Gly Tyr Met Val Pro Arg Thr Lys Cys Glu Pro 
210 215 220 

Leu Pro Arg Asn Lys Leu Leu Ser Pro Ile Ile Phe His Glu Gly Arg 
225 230 235 240 

Leu Val Gln Arg Pro Thr Pro Leu Val Ala Leu Leu Thr Phe Leu Trp 
245 250 255 

Leu Pro Val Gly Phe Val Leu Ser Ile Ile Arg Val Tyr Thr Asn Ile 
260 265 270 

Pro Leu Pro Glu Arg Ile Ala Arg Tyr Asn Tyr Lys Leu Thr Gly Ile 
275 280 285 

Lys Leu Val Val Asn Gly His Pro Pro Pro Pro Pro Lys Pro Gly Gln 
290 295 300 

Pro Gly His Leu Leu Val Cys Asn His Arg Thr Val Leu Asp Pro Val 
305 310 315 320 

Val Thr Ala Val Ala Leu Gly Arg Lys Ile Ser Cys Val Thr Tyr Ser 
325 330 335 

Ile Ser Lys Phe Ser Glu Leu Ile Ser Pro Ile Lys Ala Val Ala Leu 
340 345 350 

Thr Arg Gln Arg Glu Lys Asp Ala Ala Asn Ile Lys Arg Leu Leu Glu 
355 360 365 

Glu Gly Asp Leu Val Ile Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro 
370 375 380 

Phe Leu Leu Arg Phe Ser Ala Leu Phe Ala Glu Leu Thr Asp Arg Ile 
385 390 395 400 

Val Pro Val Ala Ile Asn Thr Lys Gln Ser Met Phe Asn Gly Thr Thr 
405 410 415 

Thr Arg Gly Tyr Lys Leu Leu Asp Pro Tyr Phe Ala Phe Met Asn Pro 
420 425 430 

Arg Pro Thr Tyr Glu Ile Thr Phe Leu Lys Gln Ile Pro Ala Glu Leu 
435 440 445 

Thr Cys Lys Gly Gly Lys Ser Pro Ile Glu Val Ala Asn Tyr Ile Gln 
450 455 460 

Arg Val Leu Gly Gly Thr Leu Gly Phe Glu Cys Thr Asn Phe Thr Arg 
465 470 475 480 

Lys Asp Lys Tyr Ala Met Leu Ala Gly Thr Asp Gly Arg Val Pro Val 
485 490 495 

Lys Lys Glu Lys 
500 



<210> 18 
<211> 1620 
<212> DNA 

<213> Arabidopsis sp. 
<400> 18 

atggcggatc ctgatctgtc ttctcctttg atccaccatc aatcctccga tcaacctgaa 60 
gttgttatct ctatcgccga cgacgacgac gacgagtcag gactcaatct tcttccagcc 120 
gttgttgacc ctcgtgtttc acgaggtttt gagtttgacc atcttaatcc ttatggcttt 180 
ctcagcgagt cagagcctcc ggttctcggt ccgacgacgg tggatccatt ccggaacaat 240 
acacctggag ttagcggatt gtacgaagcg attaagctcg tgatttgtct tccgattgct 300 
cngattagac t:tg::ccctc tgctgctagc ti-agccgttg gttacccggc :acaaaat:g 360 
g:acttgctg gctggaaaga caaagagaac cczazgcczc "t:ggaga:g cagaatcarg 420 
tggattac^c ggatctgtac cagatgta:c ctctcctcct ttggc;atca gcggaiaaga 480 
aggaaaggga aacctgctcg gagagagatt gcticcga::g t:gta:caaa tcatgttrcc 540 
tatatcgaac caatcttc:a c::cca:gaa t:ta~caccga ccattgtrgc arcggagcca 600 
catgattcac ttccaticg: tggaacrat: aticagggcaa cgcaggtgar ara:gcgaat 660 
agattctcac agacaccaag gaagaa:gct g:gcangaaa taaagagaaa agc:tcctgc 720 
gatagatrtc ctcgcctgct g:ta::cccc gaaggaacca cgactaacgg gaaagttc:: 780 
a"ttcct:cc aactcggngc zztcazcccz ggzzacccza. "ccaacctgr agtagtccgg 840 
tatccccatg cacattttga tcaat:ctgg ggaaa:atct ctttgrtgac gctcatg:: z 900 
agaangtnca c:cagt:t;ca caatt:catg gaggttgaat: atcrtccngt aa:cta:ccc 960 
agtgaaaagc aaaagcagaa cgctgcgcgt ctctcacaga agactagtca tgcaattgca 
1020 

acatcttcga argtcgrcca aacatcccat tcttttgcgg acttgatgcr actcaacaaa 
1080 

gcaactgag: caaagctgga gaacccctca aatcacatgg ::gaaatggc aagagttgag 
1140 

tcgccattc: atgtaagcag c r tagaggca acacgatttt iggatacatt tgtttccatg 
1200 

aniccggac: cgag~ggacg tgttaggzta cangactttc ttcggggtct taaactaaaa 
1260 

cct^gccctc zzz c naaaa g gatac't-gag tccatcgatg tggagaaggn cggaccaatc 

1320 

ac::tcaaac agrtcttgt: tgcctcgggc cacgtgtcga cacagccgcr tttcaaqcaa 
1380 

acatgcgag: tagccttttc ccat t.gccrat gcagatggag atggctatat tacaattcaa 

1440 

gaactcggag aagctctcaa aaacacaatc ccaaacttga acaaggacga gattcgagga 
1500 

atgtaccat: tgcragacga cgaccaagat caaagaatca gccaaaatga cttgttgtcc 
1560 

tgcttaagaa gaaaccctct tctca'agcc atcttztgcac ctgactztggc cccaacataa 
1620 

<210> 19 
<211> 539 
<212> PRT 

<213> Arabidopsis sp. 
<400> 19 

Met Ala Asp Pro Asp Leu Ser Ser Pro Leu Ile His His Gln Ser Ser 
1 5 10 15 

Asp Gln Pro Glu Val Val Ile Ser Ile Ala Asp Asp Asp Asp Asp Glu 
20 25 30 

Ser Gly Leu Asn Leu Leu Pro Ala Val Val Asp Pro Arg Val Ser Arg 
35 40 45 

Gly Phe Glu Phe Asp His Leu Asn Pro Tyr Gly Phe Leu Ser Glu Ser 
50 55 60 

Glu Pro Pro Val Leu Gly Pro Thr Thr Val Asp Pro Phe Arg Asn Asn 
65 70 75 80 

Thr Pro Gly Val Ser Gly Leu Tyr Glu Ala Ile Lys Leu Val Ile Cys 
85 90 95 

Leu Pro Ile Ala Leu Ile Arg Leu Val Leu Phe Ala Ala Ser Leu Ala 
100 105 110 

Val Gly Tyr Leu Ala Thr Lys Leu Ala Leu Ala Gly Trp Lys Asp Lys 
115 120 125 

Glu Asn Pro Met Pro Leu Trp Arg Cys Arg Ile Met Trp Ile Thr Arg 
130 135 140 

Ile Cys Thr Arg Cys Ile Leu Phe Ser Phe Gly Tyr Gln Trp Ile Arg 
145 150 155 160 




Arg Lys Gly Lys Pro Ala Arg Arg Glu Ile Ala Pro Ile Val Val Ser 
165 170 175 

Asn His Val Ser Tyr Ile Glu Pro Ile Phe Tyr Phe Tyr Glu Leu Ser 
180 185 190 

Pro Thr Ile Val Ala Ser Glu Ser His Asp Ser Leu Pro Phe Val Gly 
195 200 205 

Thr Ile Ile Arg Ala Met Gln Val Ile Tyr Val Asn Arg Phe Ser Gln 
210 215 220 

Thr Ser Arg Lys Asn Ala Val His Glu Ile Lys Arg Lys Ala Ser Cys 
225 230 235 240 

Asp Arg Phe Pro Arg Leu Leu Leu Phe Pro Glu Gly Thr Thr Thr Asn 
245 250 255 

Gly Lys Val Leu Ile Ser Phe Gln Leu Gly Ala Phe Ile Pro Gly Tyr 
260 265 270 

Pro Ile Gln Pro Val Val Val Arg Tyr Pro His Val His Phe Asp Gln 
275 280 285 

Ser Trp Gly Asn Ile Ser Leu Leu Thr Leu Met Phe Arg Met Phe Thr 
290 295 300 

Gln Phe His Asn Phe Met Glu Val Glu Tyr Leu Pro Val Ile Tyr Pro 
305 310 315 320 

Ser Glu Lys Gln Lys Gln Asn Ala Val Arg Leu Ser Gln Lys Thr Ser 
325 330 335 

His Ala Ile Ala Thr Ser Leu Asn Val Val Gln Thr Ser His Ser Phe 
340 345 350 

Ala Asp Leu Met Leu Leu Asn Lys Ala Thr Glu Leu Lys Leu Glu Asn 
355 360 365 

Pro Ser Asn Tyr Met Val Glu Met Ala Arg Val Glu Ser Leu Phe His 
370 375 380 

Val Ser Ser Leu Glu Ala Thr Arg Phe Leu Asp Thr Phe Val Ser Met 
385 390 395 400 

Ile Pro Asp Ser Ser Gly Arg Val Arg Leu His Asp Phe Leu Arg Gly 
405 410 415 

Leu Lys Leu Lys Pro Cys Pro Leu Ser Lys Arg Ile Phe Glu Phe Ile 
420 425 430 

Asp Val Glu Lys Val Gly Ser Ile Thr Phe Lys Gln Phe Leu Phe Ala 
435 440 445 

Ser Gly His Val Leu Thr Gln Pro Leu Phe Lys Gln Thr Cys Glu Leu 
450 455 460 

Ala Phe Ser His Cys Asp Ala Asp Gly Asp Gly Tyr Ile Thr Ile Gln 
465 470 475 480 

Glu Leu Gly Glu Ala Leu Lys Asn Thr Ile Pro Asn Leu Asn Lys Asp 
485 490 495 

Glu Ile Arg Gly Met Tyr His Leu Leu Asp Asp Asp Gln Asp Gln Arg 
500 505 510 

Ile Ser Gln Asn Asp Leu Leu Ser Cys Leu Arg Arg Asn Pro Leu Leu 
515 520 525 

Ile Ala Ile Phe Ala Pro Asp Leu Ala Pro Thr 
530 535 

<210> 20 
<211> 1123 
<212> DNA 

<213> Arabidopsis sp. 
<400> 20 

atggaaaaaa agag tgtac z aaattc:ga: aagttgcctc tgattagagt gttaagagg: 60 
ataatatg:c tgatggtgtt agtttcaaca gc:tt:atga tg'tgatatt ctgggggttc 120 
::atcagctg tagigtcgag g-ttttcagc a::cgc:ata gccgtaaatg tgtttccttc 180 
ttctttggct cgtggctcg: cttgtggcct ctcctctttg agaagatcaa caaaaccaaa 240 
gttatzttct ctgg:gataa ggntccttgc gagga:cgag tattgctca: tgcaaaccac 300 
cgaacagaag ttgattggat gtacttctgg gatcttgcac tgcgtaaagg ccagattggg 360 
aa:accaaac atgngccraa gagtagtttg atgaaa:*ac ctctctctgg tcgggcgtct 420 
cacc::tttg agtttat:c: tg:tgagagg agatgggaag tcgatgaagc aaacttgaga 480 
cagatagt't cgag::r;:aa ggatcccrga gacgc::tar ggctrgctc: tttccccgag 540 
ggcacagatt acacagaggc taaatgccaa aggagtaaga aatttgctgc tgaaaatggc 600 
cttccgatac tgaacaa;g: gctgctt:cc aggacaaaag g'ttcgtctic ctgcttgcaa 660 
gaactgagtc gctcac::ga cgcagttcat gatgtgacca tcggttataa aacccgctgc 720 
ccatctttct tagacaacgt ttatggaatt gagccatcag aagttcacat ccacatccgt 780 
c?tat:aacc tgacccaaat cccaaatcaa gaaaaggaca tcaatgcttg gttaatgaac 840 
a:at::cagc tcaaaga:ca gctgctcaat gact::tact ccaatggtca tttccctaac 900 
g^aggaacag agaaagagtt caacacaaag aagtacctca taaactgttc ggcagtgatt 960 
g;ctr:acca ccatctg:ac acatctcacc ttcttcccat caatgatttg gttcaggatt 
1020 

tatgtctcrt tggcctgtgt ctacttgacc tctgctacgc atttcaatct tcgttctgtt 
1080 

ccac:tgttg agactgcaaa aaattccctc aaattagtaa acaaataa 
1123 

<210> 21 
<211> 375 

<212> PRT 

<213> Arabidopsis sp. 

<400> 21 

Met Glu Lys Lys Ser Val Pro Asn Ser Asp Lys Leu Ser Leu Ile Arg 
1 5 10 15 

Val Leu Arg Gly Ile Ile Cys Leu Met Val Leu Val Ser Thr Ala Phe 
20 25 30 

Met Met Leu Ile Phe Trp Gly Phe Leu Ser Ala Val Val Leu Arg Leu 
35 40 45 

Phe Ser Ile Arg Tyr Ser Arg Lys Cys Val Ser Phe Phe Phe Gly Ser 
50 55 60 

Trp Leu Ala Leu Trp Pro Phe Leu Phe Glu Lys Ile Asn Lys Thr Lys 
65 70 75 80 

Val Ile Phe Ser Gly Asp Lys Val Pro Cys Glu Asp Arg Val Leu Leu 
85 90 95 

Ile Ala Asn His Arg Thr Glu Val Asp Trp Met Tyr Phe Trp Asp Leu 
100 105 110 

Ala Leu Arg Lys Gly Gln Ile Gly Asn Ile Lys Tyr Val Leu Lys Ser 
115 120 125 

Ser Leu Met Lys Leu Pro Leu Phe Gly Trp Ala Phe His Leu Phe Glu 
130 135 140 

Phe Ile Pro Val Glu Arg Arg Trp Glu Val Asp Glu Ala Asn Leu Arg 
145 150 155 160 

Gln Ile Val Ser Ser Phe Lys Asp Pro Arg Asp Ala Leu Trp Leu Ala 
165 170 175 

Leu Phe Pro Glu Gly Thr Asp Tyr Thr Glu Ala Lys Cys Gln Arg Ser 
180 185 190 

Lys Lys Phe Ala Ala Glu Asn Gly Leu Pro Ile Leu Asn Asn Val Leu 
195 200 205 

Leu Pro Arg Thr Lys Gly Phe Val Ser Cys Leu Gln Glu Leu Ser Cys 
210 215 220 

Ser Leu Asp Ala Val Tyr Asp Val Thr Ile Gly Tyr Lys Thr Arg Cys 
225 230 235 240 

Pro Ser Phe Leu Asp Asn Val Tyr Gly Ile Glu Pro Ser Glu Val His 
245 250 255 

Ile His Ile Arg Arg Ile Asn Leu Thr Gln Ile Pro Asn Gln Glu Lys 
260 265 270 

Asp Ile Asn Ala Trp Leu Met Asn Thr Phe Gln Leu Lys Asp Gln Leu 
275 280 285 

Leu Asn Asp Phe Tyr Ser Asn Gly His Phe Pro Asn Glu Gly Thr Glu 
290 295 300 

Lys Glu Phe Asn Thr Lys Lys Tyr Leu Ile Asn Cys Leu Ala Val Ile 
305 310 315 320 

Ala Phe Thr Thr Ile Cys Thr His Leu Thr Phe Phe Ser Ser Met Ile 
325 330 335 

Trp Phe Arg Ile Tyr Val Ser Leu Ala Cys Val Tyr Leu Thr Ser Ala 
340 345 350 

Thr His Phe Asn Leu Arg Ser Val Pro Leu Val Glu Thr Ala Lys Asn 
355 360 365 

Ser Leu Lys Leu Val Asn Lys 
370 375 



<210> 22 

<211> 1170 

<212> DNA 

<213> Arabidopsis sp. 



<400> 22 

atggtgattg ctgcagctgt catcgtgcct ttgggccttc tcttcttcat atctggtctc 60 
gctgtcaatc tctttcaggc agtttgctat gtactcattc gaccactgtc taagaacaca 120 
tacagaaaaa ttaaccgggt ggttgcagaa accttgtggt tggagcttgt atggatagtt 180 
gactggtggg ctggagttaa gatccaagtg tttgctgata atgagacctt caatcgaatg 240 
ggcaaagaac atgctcttgt cgtttgtaat caccgaagtg atattgattg gcttgtggga 300 
tggattctgg ctcagcggtc aggttgcctg ggaagcgcat tagctgtaat gaagaagtct 360 
tccaaattcc ttccagtcat aggctggtca atgtggttct cggagtatct ctttctggaa 420 
agaaattggg ccaaggatga aagcactcta aagtcaggtc ttcagcgctt gagcgacttc 480 
cctcgacctt tctggttagc cctttttgtg gagggaactc gctttacaga agccaaactt 540 
aaagccgcac aagagtatgc agcctcctct gaattgccta tccctcgaaa tgtgttgatt 600 
cctcgcacca aaggtttcgt gtcagctg t agtaatatgc gttcatttgt cccagcaatt 660 
tatgatatga cagtgactat tccaaaaacc tctccaccac ccacgatgct aagactattc 720 
aaaggacaac cttcagtggt gcatgttcac atcaagtgtc actcgatgaa agacttacct 780 
gaatcagatg acgcaattgc acagtggtgc agagatcagt ttgtggctaa ggatgctctg 840 
ttagacaaac acatagctgc agacactttc cccggtcaac aagaacagaa cattggccgt 900 
cccataaagt cccttgcggt ggttctatca tgggcatgcg tactaactct tggagcaata 960 
aagttcctac actgggcaca actcttttct tcatggaaag gtatcacgat atcggcgctt 1020 
ggtctaggta tcatcactct ctgtatgcag atcctgatac gctcgtctca gtcagagcgt 1080 
tcgaccccag ccaaagtcgt cccagccaag ccaaaagaca atcaccaccc agaatcatcc 1140 
t:ccaaacag aaacggagaa ggaaaagcaa 1170 

<210> 23 
<211> 389 
<212> PRT 

<213> Arabidopsis sp. 
<400> 23 

Met Val Ile Ala Ala Ala Val Ile Val Pro Leu Gly Leu Leu Phe Phe 
1 5 10 15 

Ile Ser Gly Leu Ala Val Asn Leu Phe Gln Ala Val Cys Tyr Val Leu 
20 25 30 

Ile Arg Pro Leu Ser Lys Asn Thr Tyr Arg Lys Ile Asn Arg Val Val 
35 40 45 

Ala Glu Thr Leu Trp Leu Glu Leu Val Trp Ile Val Asp Trp Trp Ala 
50 55 60 

Gly Val Lys Ile Gln Val Phe Ala Asp Asn Glu Thr Phe Asn Arg Met 
65 70 75 80 

Gly Lys Glu His Ala Leu Val Val Cys Asn His Arg Ser Asp Ile Asp 
85 90 95 

Trp Leu Val Gly Trp Ile Leu Ala Gln Arg Ser Gly Cys Leu Gly Ser 
100 105 110 

Ala Leu Ala Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val Ile Gly 
115 120 125 

Trp Ser Met Trp Phe Ser Glu Tyr Leu Phe Leu Glu Arg Asn Trp Ala 
130 135 140 

Lys Asp Glu Ser Thr Leu Lys Ser Gly Leu Gln Arg Leu Ser Asp Phe 
145 150 155 160 

Pro Arg Pro Phe Trp Leu Ala Leu Phe Val Glu Gly Thr Arg Phe Thr 
165 170 175 

Glu Ala Lys Leu Lys Ala Ala Gln Glu Tyr Ala Ala Ser Ser Glu Leu 
180 185 190 

Pro Ile Pro Arg Asn Val Leu Ile Pro Arg Thr Lys Gly Phe Val Ser 
195 200 205 

Ala Val Ser Asn Met Arg Ser Phe Val Pro Ala Ile Tyr Asp Met Thr 
210 215 220 

Val Thr Ile Pro Lys Thr Ser Pro Pro Pro Thr Met Leu Arg Leu Phe 
225 230 235 240 

Lys Gly Gln Pro Ser Val Val His Val His Ile Lys Cys His Ser Met 
245 250 255 

Lys Asp Leu Pro Glu Ser Asp Asp Ala Ile Ala Gln Trp Cys Arg Asp 
260 265 270 

Gln Phe Val Ala Lys Asp Ala Leu Leu Asp Lys His Ile Ala Ala Asp 
275 280 285 

Thr Phe Pro Gly Gln Gln Glu Gln Asn Ile Gly Arg Pro Ile Lys Ser 
290 295 300 

Leu Ala Val Val Leu Ser Trp Ala Cys Val Leu Thr Leu Gly Ala Ile 
305 310 315 320 

Lys Phe Leu His Trp Ala Gln Leu Phe Ser Ser Trp Lys Gly Ile Thr 
325 330 335 

Ile Ser Ala Leu Gly Leu Gly Ile Ile Thr Leu Cys Met Gln Ile Leu 
340 345 350 

Ile Arg Ser Ser Gln Ser Glu Arg Ser Thr Pro Ala Lys Val Val Pro 
355 360 365 

Ala Lys Pro Lys Asp Asn His His Pro Glu Ser Ser Ser Gln Thr Glu 
370 375 380 

Thr Glu Lys Glu Lys 
385 

<210> 24 

<211> 269 

<212> DNA 

<213> Glycine max 

<400> 24 

gacccactga acgctctcat caccttcacg tggctcccct tcggcttcat cctctccatc 60 

ataagggtct acttcaacct ccctctccca gaacncattg tccgccacac ctacgagatg 120 

ctcggcacca acctcgtcat ccgcggccac cgccctcctc cgccttcccc cggcaccccc 180 

ggcaacctct acgtctgcaa ccaccgcacc gctctcgacc ccatcgtcat cgccattgcc 240 

cLcggccgca aggtctcctg cgtcaccta 269 

<210> 25 

<211> 242 

<212> DNA 

<213> Glycine max 

<400> 25 

tgatcttcca cgacggccgt ttcgtgcaga ggccagaccc actgaacgct ctcatcacct 60 

tcacgtggct ccccttcggc ttcatcctct ccatcataag ggtctacttc aaccttcctc 120 

tcccagaacg cattgtccgc tacacctacg agatgctcgg catcaacctc gtcatccgcg 180 

gccaccgccc tcctccgcct tcccccggca cccccggcaa cctctacgtc tgcaaccacc 240 
gc 242 

<210> 26 

<211> 272 

<212> DNA 

<213> Glycine max 

<400> 26 

gtttgttcaa aggccaactc ctctagcagc cctcttgacc ttcctatggt tgccaattgg 60 

catcatactc tccatnctta agggtctacc ttaacatccc tttgcctgaa agaattgctt 120 

ggtataacta taagctatta ggaatcagag ttattgtgaa gggtacccct ccaccacccc 180 

caaagaaggg tcaaagtggt gtcctatttg tttgtaacca ccgcacagtt ttagaccctg 240 

tggttactgc agttgcactt ggaagaaaaa tt 272 

<210> 27 

<211> 218 

<212> DNA 

<213> Glycine max 

<400> 27 

atagcacagg agggttacat ggtgcccccg agcaaatcag caaaggcagt cccacaggag 60 

cgtctgaaga gcagaatgat cttccacgac gggcgtttcg tgcagaggcc agacccaatg 120 

aatgccctca tcaccttcac atggctccct ttgggtttcg tcctctccat cataagggtc 180 
tacttcaacc tccctctccc agaacgcacc gtccgcta 218 

<210> 28 

<211> 270 

<212> DNA 

<213> Glycine max 

<400> 28 

gtgccrgttg ctgtgaaccg caagcagaac atgttcttrg gaaccaccgt tcgtggcgtc 60 
aagttctggg acccttaact tacttcttac atgaacccta ggcctgtgta cgaggttacc 120 
tgtt agggt'cgaa :gcaccggg: 240 



t tacc;:gac accti:gccg aggagatgtc gg:taag 
gg:ggccaac cacgtggcag aaggtgc"gg gggatg 
tgactaggaa ggacaagta: atgtcgttgg 

<210> 29 

<211> 252 

<212> DNA 

<213> Glycine max 

<400> 29 

catgagggca ggtt:gccca aaggccaact cctc^agcrg ccctcttgac cnicctatgg 60 

ctgccaattg gcatcatact ccccatctta agggtctacc ttaacatccc tttgcctgaa 120 

agaattgt^g gtacaaccac aagctcttag gaatcagagt tattgtgaag ggtacccctc 180 

caccgccccc aaagaagggt caaagtggtg tctatttgtt tg:aaccacc gcacagtatt 240 

agaccccgit gt 252 

<210> 30 

<211> 272 

<212> DNA 

<213> Glycine max 

<400> 30 

c:ggga:tcc cttaaacga: gcatggatct tatcaagaaa ggagcctctg tttttttctt 60 

tccagaggga acacgcagta aagatggaag actaggcaca ttcaagaagg gtgctttcag 120 

tgttgctgca aagacaaatg caccagtagt accaattacc cttattggaa ctggtcaaat 180 

catgcctgca ggaaaggagg gaacagtgaa cataggttct gtgaaagtgg ttatacataa 240 

arctattgtt ggaaaggatc ctgacatgtt at 272 

<210> 31 

<211> 239 

<212> DNA 

<213> Glycine max 

<400> 31 

cgggaatcaa ggtcatcaga cttcaagggt gtttcagctg ttgtcactga cagaattcga 60 

gaagctcatc agaatgagtc tgctccatta atgatgttat ttccagaagg tacaaccaca 120 

aatggagagt tcctccttcc attcaagact ggtggttttt tggcaaaggc accggtactt 180 

cctgtgatat cacgatatca ttaccagaga tttagccctg cctgggattc catatctgg 239 

<210> 32 
<211> 242 
<212> DNA 

<213> Glycine max 

<400> 32 

gaacggcaac ggcaacagcg ttcgcgatga ccgtcctctg ctgaagccgg agcctccggt 60 

c:tccgccga cagcatcgcc gatatggaga agaagttcgc cgcttacgtc cgccgctacg 120 

tgtacggcac catgggacgc ggcgagttgc ctcccaagga gaagctcttg ctcggtttcg 180 

cgttggtcac tcttctcccc attcgagtcg ttctcgccgt caccatattg ctcttttatt 240 

ac 242 






<210> 33 

<211> 248 

<212> DNA 

<213> Glycine max 

<400> 33 

ttcttcttct ctcactctct aaaaccctaa ctctatacat ggaagggaaa nctcaaatct 60 

natgactaat taattaatcc atcgatcaag catggagtcc gaactcaaag acctcaattc 120 

gaagccgccg aacggcaacg gcaacagcgt tcgcgatgac cgtcctctgc tgaagccgga 180 

gcctccggtc tccgccgaca gcatcgccga tatggagaag aagttcgccg cttacgtccg 240 
ccgcgacg 248 

<210> 34 

<211> 217 

<212> DNA 

<213> Glycine max 

<400> 34 

aaaaccctaa ztctatacat ggaagggaaa :tcaaatct aatgactaat taattaatcc 60 
atcgatcaag catggagtcc gaactcaaag acctcaattc gaagccgccg aacggcaacg 120 
gcaacagcgt tcgcgatgac cgtcctctgc tgaagccgga gcctccggtc tccgccgaca 180 
gcatcgccga tatggagaag aagttcgccg cttacgt 217 



<210> 35 

<211> 257 

<212> DNA 

<213> Glycine max 



<400> 35 

atctctgtct ctgcatttcc ctccctaaaa ccctaattct acatttggaa aggaaatctc 60 
aaatctaatg actaattaat caatcaatcg tattaataat ccatcgatca agtatggagt 120 
ccgaactcaa agacctcaat tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt 180 
gcgacgaccg tcctctgctg aagccggagc ctccggcctc ctccgacagc atcgccgaga 240 
tggagaagaa gttcgcc 257 



<210> 36 

<211> 284 

<212> DNA 

<213> Glycine max 



<400> 36 

cccgaccaaa acaggttttt gtggccaatc atacttccat gattgatttc attatcttag 60 
aacagatgac tgcatttgct gttattatgc agaagcatcc tggatgggtt ggattattgc 120 
agagcaccat tntggagagt gtagggtgta tctggttcaa ccgtacagag gcaaaggatc 180 
gagaagttgt ggcaaggaaa ttgagggatc atgtcctggg agctaacaac aaccctcttc 240 
ttatatttcc tgaaggaact tgtgtaaata atcactactc gtca 284 



<210> 37 

<211> 246 

<212> DNA 

<213> Glycine max 



<400> 37 

ggagatccgc ataagcaaat caatcatcct gttccttcct tatctctgtc tctgcatttc 60 
cctccctaaa accctaattc tacatttgga aaggaantct caaatctaat gataattaat 120 
caatcaatcg tattaataat ccatcgatca agtatggagt ccgaactcaa agacctcaat 180 
tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt gcgacgaccg tcctctgctg 240 
aagccg 246 



<210> 38 

<211> 278 

<212> DNA 

<213> Glycine max 



<400> 38 

gttttctatt gccacgttgt ggaagcgtaa cgaagatgaa tggcattggg aaactcaaat 60 
cgtcgagttc tgaattggac cttcacattg aagattacct accttctgga tccagtgttc 120 
aacaagaacg gcatggcaag ctccgactgt gtgatttgct agacatttct cctagtctat 180 
ctgaggcagc acgtgccatt gtagatgata cattcacaag gtgcttcaag caaatcctcc 240 
agaaccttgg aactggaatg tttatttgtt tcctttgt 278 



<210> 39 

<211> 312 

<212> DNA 

<213> Glycine max 



<400> 39 

ttaactttgg cacattctcc rtttgttcat caatgtgtgt tgtaaattgt ncatttcctt 60 
cagaggtctt tggtaganat gatgtgcagt ttctgtggtg catcttggac tgnggntgtt 120 
aagnatcatg gacccaggcc tagcaggaga ccaaagcagg tttttgtagc caaccatact 180 
tcatgattga tntcattatn tnagaacaga tgactgcttt tgcngttatn atgcagaagc 240 
atcctggatg ggttggtaag cntacagnat gtcaacngtg tatnaaatat gntacacnnn 300 
acttgcgtct tc 312 



<210> 40 

<211> 255 

<212> DNA 

<213> Glycine max 
<400> 40 

ggattattgn ngcanatgca gccarctgtt ctaagataat ganatcna-c atggaag'at 60 
gatcggncac anaaacc:g: yttttggttg gatactaggt ctcggcccat ggtaccrgac 120 
naccccag:c catgatgcaa car.aganac: gnacatcatc tccaccaaac ccctc:gana 180 
ganacgagaa ttgagcaatt tagagtaccc cggttcgatg caagtcagta tatccaagct 240 
tcta:tcatc aaagg 255 



<210> 41 
<211> 291 

<212> DNA 

<213> Glycine max 

<400> 41 

caacctccca tgcaatcgct caccctc:cc gtcacccgaa tctgttttct attccctccg 60 
tcgcgtaaca aggatgaatg gcattgggaa actcaaatcg tcgagttctg aattggacct 120 
tca:a;:gaa gattacctgc cttc:ggatc cag~gttcaa caagaacggc atggcaagct 180 
ccgcctgtgt gatttgctag acatttctcc tagtctiatct gaggcagcac gtgccattgt 240 
agatgataca ttcacaagg: gcrccaagcc aaatcctcca gaaccttgga a 291 



<210> 42 

<211> 284 

<212> DNA 

<213> Glycine max 

<400> 42 

ctg:aa:cta ccatgcaa:: cctcacctga atccgttctc tattgccacg ttgtggaagc 60 
gcaacgaaga tgaatggcat tgggaaactc aaatcgtcga gttctgaatt ggaccttcac 120 
a:tgaaga:: acctaccttc :ggarccagt g::caacaag aacggcatgg caagc:ccga 180 
ctgigtgatr tgctagacat ctctcctagt ctatctgagg cagcacgtgc catgtagatg 240 
a:aca::aca aggtgctcaa gt:caaatctc cagaaccttg gaat 284 

<210> 43 

<211> 268 

<212> DNA 

<213> Glycine max 

<400> 43 

ctgaagtatt ctcgtcctag cccaaagcat agagaaaggn agcaacagaa ctttgctgag 60 

tcagtgctgc ggcgatggga ggaaaagtga tgtgtacctt tatgtggtgt tgttc^taat 120 

tattcttagt aatgccattg cttcgacccc ttnttttgct tttgttttgt cattgctaac 180 

tatitatttt taacactttt attaaagata tggcatatat ncacttcagt anacaaagtt 240 

ginccagtaa tttnttttcc aaaaaaaa 268 



<210> 44 
<211> 241 

<212> DNA 

<213> Glycine max 
<400> 44 

gancaaaatt gccctccatc actttccttg ttagagttgg tttctgcnac ctaccacgca 60 

attccctcac ctgaatccgt tttctattgc cacgttgtgg aagcgtaacg aagatgaatg 120 

gcattgggaa actcaaatcg tcgagttctg aattggacct tcacattgaa gattacctac 180 

cttctggatc cagtgttcaa caagaacggc atggcaagct ccgactgtgt gatttgctag 240 
a 241 

<210> 45 

<211> 247 

<212> DNA 

<213> Glycine max 

<400> 45 

gtaggatgtc tgagatccct gccccaatca aaacggtgcg gttaactaga aaccgcgacg 60 
aggatgcgaa aatgatgaaa aatttgctgg ggcaagggga cctggtggtt tgtcctgaag 120 
ggaccacatg tagagaacct tatttattga ggttcagccc tctgttctca gagatgtgcg 180 
atgagattgt ccccgttggc agttgattcc cagttatatig ttccacggaa ccactgctgg 240 
tgganta 247 

<210> 46 
<211> 271 
<212> DNA 

<213> Glycine max 
<400> 46 

tgcagggggg cttgttagag ccatagtttt ggttcttcta cacccttttg tttgtgtcgt 60 

aggaaaagag atggggttga agataatggt catggcatgc ttcttcggga tcaaagcatc 120 

gagcttcaga gttggaaggt ccgttttgcc cnaattcttc tnggaggacg ttngtgcaga 180 

aatgtttgag gcactcaaaa aaggagggaa gacagtggga gttaccaatt taccccacgt 240 

gatggtggaa agcttcttga gagagtattt g 271 

<210> 47 

<211> 242 

<212> DNA 

<213> Glycine max 

<400> 47 

ttcacagctg tcacgccgtn aacggaaaat ggcaacggcg agacgcagtt tcccgcctat 60 

caccgaatgc aacggaacga cnccgtgcga ntctgtngnc gccgacctcg agggtacgct 120 

cctcatctcc cgtngctcgt tcccgtactt catgctcgtc gccgtcgaag ccggcagcnt 180 

cctccgcggc ctcatgctnc tcctctccct tccgttcgtc atnatcgcct acctcttcat 240 
ct 242 

<210> 48 

<211> 244 

<212> DNA 

<213> Glycine max 

<400> 48 

acatattctt cagttagctc ccccaaccta tacacttcac caccacacca caaccctacc 60 

ctctctctct gtcatggtca ttggaggagc cttccctcgt ttcgacccaa tcaccaaatg 120 

tagacccaag accgctccaa ccagaccatc gcctcggacc tcgatggcac cctccttgtc 180 

tcccggagtg ccttccccta ctacttcctc gtcgccctcg aagccggcag cgtcttccga 240 
gcct 244 

<210> 49 

<211> 230 

<212> DNA 

<213> Glycine max 

<400> 49 

caacattcca cctagctccc caatcacatc ttcaccacac cataaacctt cttaatttct 60 

ctcttcattt tctcctctat tgtcataatc atggggacct tccctcgctt cgacccaatc 120 

accacccaag accggtccaa ccagaccgtg gcctccgacc ttgacggcac cctcctcgtc 180 

tcccggagcg ccttccccta ctacctcctc gttgccctcg aagccggcag 230 

<210> 50 

<211> 265 

<212> DNA 

<213> Glycine max 

<400> 50 

ctggtgaata atcctaagtt atggagtctg tggtgtgtga gctagaaggc acgcttgtga 60 

aggacaagga tgcgttctca tacttcatgt tggttgcgtt tgaagcttca ggtttggttc 120 

gtttcgcctt gttgctaaca ctattgcccg tgattcggtt ccttgacatg gttggcatga 180 

acgatgcatc tctcaagcta ntnatcttcg tggctgtggc tggtgctcca aagtccgaga 240 

ttgaatcagt ggctagggca gtttt 265 

<210> 51 

<211> 252 

<212> DNA 

<213> Glycine max 

<400> 51 

ctggtgaata atcctaagtt atggagtctg tggtgtgtga gctagaaggc acgcttgtga 60 

aggacaagga tgcgttctca tacttcatgt tggttgcgtt tgaagcttca ggtttggttc 120 

gtttcgcctt gttgctaaca ctattgcccg tgattcggtt ccttgacatg gttggcatga 180 

acgatgcatc tctcaagcta atgatcttcg tggctgtggc tgggttccaa agtccgagat 240 
tgaatcagtg gc 252 

<210> 52 
<211> 218 
<212> DNA 

<213> Glycine max 

<400> 52 

aacrgcaac: acaacaaca: ica.iroa.zic acagctgtca cgccgtgaac ggaaaatggc 60 

aacggcgaga cgcagtttac ccgccta:ac accgaatgca acggaacgac accgtgcgag 120 

tcrg'ggccg ccgaccLcga ccgtacgccc czca.znzccc gtagcncgcc cccgzacztc 180 

atg;tcgt:cg ccgtcgaagc cggcagcccc ctccgcgg 218 

<210> 53 

<211> 262 
<212> DNA 

<213> Glycine max 

<400> 53 

gc::aaggac at~gagatgg tcgnntcccc ggtgc~gccc aag::c:aca ccgaggacg: 60 

gcnccccgag agccggagag :cttcaatcc ttcgggaagc gttaca^tgt: cactgctag: 120 

ctaggg-gat ggtggagcan tttgttaaga cgtttcttgg ggctgataag gtgcttggga 180 

ctgagc:tga ggccacgaaa tcggggagg: ccatgggit: g~caaggagc ctggtgtgct 240 

tcttggggag cacaagaaag tg 262 

<210> 54 
<211> 212 

<212> DNA 

<213> Glycine max 

<400> 54 

gcaacracaa caacattcat tcattcacag ctgccacgcc gtgaacggaa aa~ggcaacg 60 

gcgagacgca gctccccgcc taicaccgaa tgcaacggaa cgacgccgtg cgagtccgtg 120 

gccgccgacc tcgacggtac gctcctcatc tcccgtagnc cgtccccgta cttcatgctc 180 

gtngccgtcg aagccggcag cctcctccgc gg 212 

<210> 55 

<211> 273 

<212> DNA 

<213> Glycine max 

<400> 55 

catggt:."tc tcgagcttcc ttggccccag aaaggacaca ttcagaacag gatcagctgt 60 

tctggcaaag ttcttcttag aagatgttgg attggaaggc tttgaggccg taatatgttg 120 

tgagagaaaa gtggcatcta gtaagttgcc aagggtcatg gttgaaaatt tcctcaagga 180 

ctatttaggg gttgatgctg ttatagcaag agaattgaag tcctttagtg gcttcttttt 240 

gggagttttt gagagtaaga agccaattaa aat 273 

<210> 56 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 56 

ctctcaaaaa aggagggaag acagtgggag tcaccaatct accccatgtg atggtggaaa 60 

gcttcttgag agagtatttg gacattgatt tcgttgtggg cagggagctg aaagtitttct 120 

gtggatacta cgtaggattg atggatgaca caaaaactat gcatgccttg gagctggtta 180 

aagaaggaaa aggatgctcc gacatgatcg gaatcacaag gtctcgcaac atacgcgacc 240 

atgatgattt tttctcc 257 

<210> 57 
<211> 240 
<212> DNA 

<213> Glycine max 

<400> 57 

gaactaagtg tgaaccacta ccaagaaaca agcttrtaag tccaattatt ttccatgagg 60 

gtaggtttgc tcaaaggcca actcctctag ctgnnctctt gaccttccta tggctgccaa 120 

ttggcatcat actctccatc ttaagggtct accttaacat ccctttgcct gaaagaattg 180 

cttggtacaa ctacaagctc ttaggaatca gagtitiat: tgt gaagggtacc cctccaccgc 240 

<210> 58 
<211> 254 
<212> DNA 
<213> Glycine max 

<400> 58 

cttggaataa gggtcattag gaagggtatc cctccacccc cagcnaagaa gggccaaagt 60 
ggagtcctat ttgtatgcaa ccacaggaca gttttagacc ctgtggttac agctgttgca 120 
ttaggaagga aaattagctg tgtcacatat agcataagca aattcactga aataatttca 180 
ccaatcaaag ctgtggcact ctctagggag agggacaaag atgctgccaa catcaagang 240 
tngcttgagg aagg 254 



<210> 59 

<211> 267 

<212> DNA 

<213> Glycine max 



<400> 59 

gccaganaga cttgcttggt acaactacaa gcttcttgga ataagggtca ttaggaaggg 60 
tatccctcca cccccagcaa agaagggcca aagtggagtc ctatttgtat gcaaccacag 120 
gacagtttta gaccctgtgg ctacagctgt tgcattagga aggaaaatta gctgtgtcac 180 
atatagcata agcaaattca ctgaaataat tcaccaatca aagctgtggc actctctagg 240 
gagagggacc nagatgctgc cnacatc 267 



<210> 60 

<211> 261 

<212> DNA 

<213> Glycine max 



<400> 60 

gtaaccacag ggtctaaaac tgtgcggtgg ttactgcagt tgcacttgnc nagaaaaatt 60 
tgcttatgct atatgtgaca cagctaattc actgnaataa tttcaccaat taaagctgtg 120 
gcactctcaa ggganngaga gaaagatgct gccaatatcc ngagactact tgaggaaggg 180 
gacttggtga tttgccctga aggcacaact tgtagagagc cttcctcttg aggttcagtg 240 
cactatttgc cgaactcact g 261 



<210> 61 

<211> 258 

<212> DNA 

<213> Glycine max 

<400> 61 

caaggagctc acatgcagtg gagggaaatc agctattgaa gttgcaaact acattcaaag 60 
ggttcttgca gggactttgg gatttgagtg cacaaatttg actaggaaga gcaaatatgc 120 
catgcttgca ggcacagatg ggacagttcc atctaaggag aaggcttgan aagggagaga 180 
aattaagttc tcccttttga ttattctgta ttggtgccca atgtgtttcc aaaacactta 240 
gaattatgat agaaataa 258 



<210> 62 

<211> 258 

<212> DNA 

<213> Glycine max 



<400> 62 

attggcataa tcctctccat cctaagggtc tatctcaaca tccctctgcc agaaagactt 60 
gcttgntaca actacaagct tcttggaata agggtcatta ggaagggtat ccctccaccc 120 
ccagcaaaga agggccaaag tggagcctat ttgtatgcaa ccacaggaca gttttagacc 180 
ctgtggttac agctgttgca ttaggaagga aaattagctg tgtcacatat agcataagca 240 
aattcactga aataattt 258 



<210> 63 

<211> 239 

<212> DNA 

<213> Glycine max 



<400> 63 

cacttcacca ccacaccaca accctaccct ctctctctgt catggtcatt ggaggagcct 60 

tccctcgttt cgacccaatc accaaatgta gcacccaaga ccgctccaac cagaccatcg 120 

cctcggacct cgatggcacc ctccttgtct cccggagtgc cttcccctac tacttcctcg 180 

tcgccctcga agccggcagc gtcttccgag ccctccttct ctnaaccttc gtccccttc 239 

<210> 64 
<211> 531 
<212> DNA 

<213> Glycine max 



<400> 64 
ccgagaaccg gectaaccaa accgtggcc" eggaettgga cggcacccnc c'ggtigtccc 60 
cragegcat- tcctcactac atgctggteg ccatcgaagc eggcagettc ccccgtggcc 120 
t:g:cc:cct tgeetccgcc cctttcgtgt a::cacgcac a:a:ncct:zz ccgagaccgc 180 
ggecatraag tccctga::;: tca:cgcc:t cgegggeztg aaggtcaggg aegctgagat 240 
ggtcgcgtgc tzggrgctgc ccaag::c"a cgccgacata ttc^:cag:t agctccccca 300 
acc:atacac ttcaccacca caccacaacc ctaccctctc tctctgtcat: ggtcattgga 360 
ggagccttcc ctegtttega cccaatcacc aaatgtagca cccaagaccg ctccaaccag 420 
accatcgc-Z cggaccccga tggcaccctc c:tgtctccc ggagtgcctt: ccccractac 480 
ttcctccc:g ccctcgaagc cggcagcgtc ttccgagccc tccttctctt a 531 



<210> 65 

<211> 256 

<212> DNA 

<213> Glycine max 

<400> 65 

a:a:a::c:c cagttagctc ccccaaccta tacacttcac caccacacca caaccctacc 60 

ctctctctct gtcatggcca t:ggaggagc cttccctcgt t:cgacccaa tcaccaaatg 120 

tagcacccaa gaccgctcca accagaccat cgcc:cggac ctcgatggca ccc:ccttg: 180 

c n cccggagt gcczzccccz actacr. tcct cg:cgccctc gaagceggea gcgtcttccg 240 

a:c:ctcc".t ctctta 256 



<210> 66 

<211> 250 

<212> DNA 

<213> Glycine max 

<400> 66 

cca:ccaaca tattcttcag ttagctcccc caacctatac acntcaccac cacaccacaa 60 

ccc":ac:ct: tctctctgt: atggccattg gaggagcett ccctcgtttc gacccaatca 120 

ccaaatgtag cacccaagac cgctccaacc agactaccgc ctcggacctc gacggcaccc 180 

:cc:tg:ct: ccggagtgcc ttcccctact ac'tcctcgt cgccctcgaa gccggcagcg 240 

t':t-.ccgagc cctccttctc 250 

<210> 67 
<211> 248 
<212> DNA 

<213> Glycine max 

<400> 67 

caccaaccaa acctcactct ccctttctcc cc:gaccctc tccctgccat ggtratggga 60 

gcctttggcc actccgaacc ggrctccaaa tgcagcaccg agaaceggtc taaccaaacc 120 

gtggcctcgg acttggaegg caccctcctg gtgtccccca gcgcatttcc rtactacatg 180 

cr.gggcgcca tegaagcegg cagcttcctc cgtggccttg tcccccttgc ctccgtccct 240 
ticgtgta 248 

<210> 68

<211> 283

<212> DNA

<213> Glycine max

<400> 68

ttc: tcccca ccatcacacc aancaaacct cactctncct ggccatggtc atgnnngect t ~) 

ttccgccact tegaaceggt ctccaaatgc agcaccgaaa accggtttaa ccaaaccgtg 120

gcctcggact tggaeggcac cctcctggtg ccccctagcg cctttcctta ctacatgctc 180

gtcgccatcg aagccggcag cttcctccgt ggccttgtcc tccttggatc cc 

gtgtacttca cgtacatatt cttctccgag accgcggcca tea 

<210> 69
<211> 258

<212> DNA

<213> Glycine max

<400> 69

ctcttcttcc ccaccatcnr. accaaccaaa cctcactctc cctgaccatg gtcatgggag 60
cctttcgcca cttcgaaccg gtttccaaat gcagcaccga aaaccggttt aaccaaaccg 120
tggcctcgga cttggacggc accctcctgg tgtcccctag cgcctttcct tactacatgc 180
tcgtcgccat cgaagccggc agcttcctcc gnggccttgt: cctccttgga tccgtccctt 240
tcgtgtactt cacgtaca 258

<210> 70 

<211> 256 

<212> DNA 

<213> Glycine max

<400> 70

tgcaactaca acaacattca ttcattcaca gctgtcacgc cgtgaacgga aaatggcaac 60 

ggcgagacgc agtttcccgc ctatcaccga atgcaacgga acgacaccgc gcgagtctgt 120 

ggccgccgac ctcgacggta cgctcctcat ctcccgtagc tcgttcccgt acttcatgct 180 

cgtcgccgtc gaagccggca gcntcctccg cggcctcatc ctcctcctng ccantccgtt 240 
cgtcatcanc gcctac 256 

<210> 71 

<211> 259 

<212> DNA 

<213> Glycine max 

<400> 71 

cttccccacc atcacaccan ggcnaacctc antctccctt tctccacnga ccctctccct 60 

gccatngtca tgggancctt tggccacttc gaaccggtct ccaaatgcag caccgagaac 120 

cggnctaacc aaaccgtggc ctcggacttg gacggcaccc tcctggtgtc ccncagcgca 180 

tttccttact acatgctggc ngccatcgaa gccggcagct tcctccgtgg ccttgtcctc 240 

cttgcctccg tccctttcg 259 

<210> 72 

<211> 249 

<212> DNA 

<213> Glycine max 

<400> 72 

ccaacatatt cttcagttag ctcccccaac ccatacactt caccaccaca ccacaaccct 60 

accctctctc tctgtcatgg tcattggagg agccttccct cgtttcgacc caatcaccaa 120 

atgtagcacc caagaccgct ccaaccagac catcgcctcg gacctcgatg gcaccctnct 180 

tgtctcccgg agtgccttcc cctactactt cctcgtcgcc ctcgaagccg gcagcgtctt 240 
ncgagccct 249 

<210> 73 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 73 

caaccctctt cttccccacc atcacaccaa ncaaacctca ctctcccttt ctcccctgac 60 

cctctccctg ccatggtcat gggagccctt ggccactccg aaccggtctc caaatgcagc 120 

accgagaacc ggtctaacca aaccgtggcc tcggacttgg acggcaccct cctggtgtcc 180 

cccagcgcat ntccttacta catgctggtc gccatcgaag ccggcagctt cctccgtggc 240 
cttgtcctcc ttgcctg 257

<210> 74 

<211> 255 

<212> DNA 

<213> Glycine max 

<400> 74

gccgaagacg tgcacccgga gagttggaga gtgttcaact ctttcgggaa gcgttacatt 60 

gtcacggcta gtcctagggt gatggtggag ccgtttgtta aggcgtttct cggggctgac 120 

aaggtgcttg ggactgaact tgaggccacc aaatcgggga cgttcactgg gtttgttaag 180

aagcctggtg tgcttgttgg ggagcataag aaagtggctc tggtgaagga gtttcagggt 240 
aattacctga cttgg 255 

<210> 75 

<211> 244 

<212> DNA 

<213> Glycine max 

<400> 75 
caacaacat: cattca::c2 cagctctcac gccgtgaacg gaaaatggca acggcgagac 60

gcaj::rccc gcc:ancacc gaatgcaacg gaacgacacc gtgcgagtct gtggccgccg 120

acc~cgacgg tacgc:cc:c atcr.cccgta gcccgttccc gtacttcatg ctcgtcgccg 180

:cgaagc:gg cagcc:cc:c cgcggcccca tgcnctcctg cg:t:ar.ttt gagnacccct 240
gag? 244

<210> 76
<211> 243

<212> DNA

<213> Glycine max

<400> 76

gctggccac: ctcttcttcc ccaccatcac accaatcaaa cctcactcta ccctggccat 60

ggtcatggga gcctttr.cgc cacttcgaac cggtttccaa atgcagcacc gaanaccgg: 120

::r.accanac cctggccccg gncttggacg gcaccczccz ggtgtcccct agcgcctttc 180

c'tactaca: gctcgccgcc atcgaagcrg gcagcttcc: ccgtggcttg tcctccttgg 240

<210> 77

<211> 263

<212> DNA

<213> Glycine max

<400> 77

gr.c.cfjggg gctgacaagg tgcttgggac ngaacttgag gccaccaaat cggggacgc: 60

cactgggttt gttaagaagc c-ggrgcgct tgttggggag cataagaaag tggctctgg- 120

gaaggagtit cagggtaat: tacctgactt gggtctaggt gatagtaaaa gtgattatga 180

c:tca:g*:ca a:"tgcaagg aagggtacat ccrtgccaaga actaagtgtg aaccactacc 240

aagaaacaag ctt:taag:c caa 263

<210> 78

<211> 258

<212> DNA

<213> Glycine max

<400> 78

ggccacgaaa tcggggaggt tcactggg:: rgttaaggag cccggtgtgc ttgttgggga 60

gcacaagaaa gtggctgttg tgaaggagtt tcagggtaat ttacctgacr; tgggactagg 120

agatagtaaa agtgattatg acttcatgtc aatttgcaag gaagggtaca tggtgccaag 180

gactaagtgt: gaaccactac caagaaacaa acttttaagt ccaattattt ntcatgaggg 240

taggtttgtt caaaggcc 258

<210> 79

<211> 260
<212> DNA

<213> Glycine max

<400> 79

ctcttctlcc ccaccatcac accaancaaa cctcactctc cccttctccc ctgaccctct 60 

ccctgccatg gtcatgggag cctttggcca cttcgaaccg gtctccaaat gcagcaccga 120 

gaaccggtct aaccaaaccg tggcctcgga cttggacggc accctcctgg tgtcccccag 180

cgcatt'.cct tactacatgc tggtcgccat cgaagccggc agcttcctcc gtgggccttg 240
tcctccttgc ctccgtccct 260

<210> 80

<211> 257

<212> DNA

<213> Glycine max

<400> 80

gggaacaaca acaaatggca ngaaccttat ctccttccaa cttggtgcat ttatccctgg 60

atacccaatc cagcctgtaa ttgtacgcta tcctcatgtg cactttgacc aatcctgggg 120

tcatgtnt.ct ttgggaaagc ttatgttcag aatgttcact caatttcaca acttttttga 180

ggtagaatat cttcctgtca tttatcccct ggatgataag gaaactgctg tancttntcg 240

ggagaggact ageeggg 257



<210> 81
<211> 272

<212> DNA

<213> Glycine max
<400> 81

cataccttt gttggcacca ttattagagc aatgcaggtc atatatgtta acagattctt 60

accatcatca aggaagcagg ctgttaggga aataaaggaa ctgaataaca gagaagggcc 120

tcttgtgata aatttcctcg agtactatta tttcccgagg gaacaacaac taatggcagg 180

aaccttatct ccttccaact tggtgcattt atccctggat acccaatcca gcctgtaatt 240

atacgctatc ctcatgtaca ctttgaccaa tc 272

<210> 82 

<211> 245 

<212> DNA 

<213> Glycine max 

<400> 82 

gggcatttca catactagag tccatcccag tgaaaagaaa gtgggaggct gatgaatcaa 60 

tcatgcgcca tatgctttct acattcaagg atccacaaga tcctctctgg cttgcgcttt 120 

tcccagaagg cactgatttc actgagcaaa agtgccttcg gagtcaaaaa tatgctgctg 180 



aacataagtL accggttctg aaaaatgttt tacttccaag gacaaagggg cttctgtgcc 240
gettg 245

<210> 83 

<211> 268 

<212> DNA 

<213> Glycine max 

<400> 83 

cagtgtcctt cctttctgga caatgttttt ggtgttgacc cttcagaagt gcacctgcat 60 

gtgcggcgta ttccggtgga ggagattcca gcttctgaaa ccaaagctgc ctcttggtta 120 

atcgacacat tccagatcaa ggaccaattg ettteggatt tcaagattca aggecattte 180 

cctaaccaac taaangaaaa tgaaatctct agatttaaga gcctactctc ttttatggtg 240 

atagtttctt ttactgecat gtttattt 268 

<210> 84 

<211> 265 

<212> DNA 

<213> Glycine max

<400> 84 

gaaagagact gggcaaaaga tgaaacatca ctgaagtcag gttttaggca tctagagcac 60 

atgccattcc ctttctggtt ggcccttttc gttgaaggaa ctcgtttcac gcagacaaag 120 

cttttacaag ctcaagagtt tgetgettea aaagggctgc ctatacctag aaatgttttg 180 

attcctegta ctaagggttt tgtcacagca gnacaaagee tteggecatt tcgttccagc 240 

catttatgat tgcacatatg cagtt 265 



<210> 85 

<211> 265 

<212> DNA 

<213> Glycine max 

<400> 85 

gaaagagact gggcaaaaga tgaaacatca ctgaagtcag gttttaggca tctagagcac 60 

atgccattcc ctttctggtt ggcccttttt gttgaaggaa ctcgtttcac gcagacaaag 120 

cttttacaag ctcaagagtt tgetgettea aaagggctgc ctatacctag aaatgttttg 180 

attcctegta ctaagggttt tgtcacagca gnacaaagee tteggecatt tcgttccagc 240 

catttatgat tgcacatatg cagtt 265 

<210> 86 
<211> 301 
<212> DNA 
<213> Zea mays



<400> 86 

ctcgtcgtca agggcacccc gccgccgccg cccaagaagg gccacccggg cgtcctcttc 60 

gtctgcaacc accgcaccgt gctcgacccc gtcgaggtgg ccgtggcgct gcgccgcaag 120 

gtcagctgcg tcacctacag catctccaag ttctccgagc tcatctcgcc catcaaggcc 180 

gtcgcgctgt cgegggagge gacaaggacg ccgagaacat ccgccgcctg ctggaggagg 240 

gcgacctggt catctgcccc gagggnaaca actgccgcga gcccttcctg ctgcgttcag 300 
n 301 



<210> 87 
<211> 309 
<212> DNA
<213> Zea mays

<400> 87

ccctcatgcg g:gtacatca acctgccgct: gcccgagcgc at:g:ccac: acacc'acaa 60

gcccatgggc atcaggctcg rcgccaagcg caccccgccg ccgccgccca agaagggcca 120

cccgggcgtc ctc:tcg:c: gcaacraccg ca:cgtgc:c ga:cccg:cg aggtggccgt 180

gccgctgcgc cgcaaggcca gccgcgccac c:acagcatc :c:aag-tct ccgagccca: 240

ctcccccatc aaggccg:; cgc:g-,cggg gargcgacaa ggacgccgag aaca:c:gcc 300
gcctgctgg 309



<210> 88

<211> 304

<212> DNA

<213> Zea mays

<400> 88

cggctgtgca ggaggcctac ctggcgacgt caaggaagta cagcccggtg cccaggaacc 60

agctgctgag c:cgc:gatt cgtgcacgac gg:cgcctcg tg:agcgc:c gacgccgctc 120

gtcgcgcccg t;acc::cct ctggatgccg t:cggcttcg cg:tggcg:t catgcg:gtg 180

tacatcaacc tgccgctgcc cgagcgcatc g:ctac:aca cc:acaag:t catggg:atc 240

aggctcgtcg t:aagggcac cccgccgccg ccgcccaaga agggccac:c gggcgtcctc 300
tree 304



<210> 89

<211> 322

<212> DNA

<213> Zea mays

<400> 89


















ggttcatcca 


cc 


eg 


tgttgc 


ta 






gaccg 


g 


caaagatt tn 


gg 




-aeggt 


ga 




aa 


tc tec 


a 


gagaatctgc 


ct 


:caaa:ag 


ct 


g 


tc 


ttggt 


g 


gataittata 


cc 




tctaac 






agggagg 




tttatgttcc 


c t 


at 


tatagg 


gt 


g 


gg 


:aatg 


t- 


atggacagca 


gg 

















tacegtagg 


agagcacagc 


actancatcg 


e: 


::t:ctaca 


atcttnaggt 


cgaaggaatg 




tctatgttg 


ctaaccat ^a 


gagct tc ttg 


130 


gcttcaaat 


t tataagc aa 


gaccagcatc 


2 4 0 


ate tc ttgg 


gtgtgatt ~c 


tetgeggegt 


3 v C 
312 



<210> 90
<211> 264
<212> DNA
<213> Zea mays

<400> 90

gqtgctgtat ctgaaagaa: ccatcgtget catcaacaga aaaatgea aatgatgcta 60

ctcttcccct gagggcacaa ctacaaatgg ggattatctc cttccattea aaacaggtgc 120

ttttcttgea aaggcaccag ttcaaccagt cattttgaga tatccttaca aaagatttaa 180

tgcagcatgg gattccatgt caggggcacg tcatgtat11 ctgetgetct gtcaatttgt 240

aaattaccta gaggtggtcc gctt 264



<210> 91 

<211> 212 

<212> DNA 

<213> Zea mays 



<400> 91

aaatgtcttg gatgeatttt tgttcagcgg gagtcgaaaa caccagattt caaaggtgtt 60

tcaggtgctg tatttgaaag aatccatcgt gctcatcaac agaaaaatgc accaatgatg 120

ctactcttcc ctgagggcac aacta:aaat ggggattatc tccttccatt caaaacaggt 180

gcttttcttg caaaggcacc agttcaacca gt 212



<210> 92

<211> 267

<212> DNA

<213> Zea mays

<400> 92

gtctaaagaa atngaaaggc gcggggnaat tgtgtctaat catgtntctt atgtggatat 60

tctttatcan atgtcagcct cttttcctag ttttgttgct aagagatcag tggntagatt 120

gectctagtt ggtctcataa gcaaatgtct tggatgear tttgttcagc gggagtnnaa 180

aatneanat: tcaaaggtgt ttaaggtgtg gnatctgaaa gaatccatcg tgctcatcaa 240
cagaaaaatg caccaatga: gctactc 267



<210> 93 
<211> 152 
<212> DNA 
<213> Zea mays 



<400> 93

ccacaaatgg ggattacctt cttccattta agactggagc ctttnttgca ggtgcaccag 60 

tgcagccagt cattttgaaa tacccttaca ggagatttag tccagcatgg gattcaatgg 120 

atggagcacg tcatgtgtta ttgctgctct gt 152 



<210> 94 
<211> 274 
<212> DNA 
<213> Zea mays 



<400> 94 

aaaatataaa ttaatatggt cttaatccca ccatataaat aacgttctct ttctgcaggg 60 

caatttagtt ctttctaata ttgggctggc agagaagcgc gtgtaccatg cagcactgac 120 

tggtagtagt ctacctggcg ctagacatga gaaagatgat tgaaagacgt tgcgtcgctt 180 

tctctgtaac agacagccga ggaacactta aaaatgtaac tgtgngcgtg tttttatacc 240 
tgtaatgtgg cagtttattt gtttgaggag gctg 274 

<210> 95 

<211> 295 

<212> DNA 

<213> Zea mays 



<400> 95 

aatagctatc aagtacaata aaatatttgt tgatgccttt tggaacagta agaagcaatc 60 

ttttacaatg cacttggtcc ggctgatgac atcatgggct gttgtgtgtg atgtttggta 120 

cttacctcct caatatctga gggagggaga gacggcaatt gcatttgctg agagagtaag 180 

ggacatgata gctgctagag ctggactaaa gaaggttcct tgggatggct atctgaaaca 240 

caaccgtcct agtcccaaac acactgaaga gaacaacgca tattgccgat ctgtc 295 



<210> 96 
<211> 273 
<212> DNA 
<213> Zea mays 



<400> 96 

gngccatctc accggcggcn ggcctgcggc cggcaaccgg aggcgatggc gagctngtct 60 

gtggtggcgg acatggagca ntaccgcccc aacctggagg actacctccc gcccgactcg 120 

ctcccgcagg aggcgcccag gaatctccat ctgcgcgatc tgcttgacat ctcgccggtg 180 

ctaaccgagg cagcgggtgc catagtcgat gattcattca cccgttgctt taagtcgaat 240 

tctccagaac catggaatgg aacatatatt tgt 273 



<210> 97 
<211> 127 
<212> DNA 
<213> Zea mays 



<400> 97 

ctcaatatct ganggaggga gagactgcaa ttgcgtttgc tgagagagta agggacatga 60
tagcagctag agctggtctt aagaaggtcc cgtgggatgg ctatctgaag cacaaccgcc 120
ctagtcc 127



<210> 98 
<211> 286 
<212> DNA 
<213> Zea mays 



<400> 98 

gaaccgtacg cgcctcatta cgcccatcca cgtgctcgcc tctccccatc gcataatttt 60

nctcggcggc gtcgccatct ccancggcng cnggcctgcn gccggcaacc ggaggcgatg 120

gcgagctcgt ctgtggcggc ggacatggag ctggaccgcc ccaacctgga ggactacntc 180

ccgcccgant cgctcccgca ggaggcgacc aggaatctcc atctgngcga tctgcttgan 240

atctcgccgg tgctaaccga ggcagcgggt gccatagtcg atgatt 286



<210> 99 

<211> 308

<212> DNA

<213> Zea mays

<400> 99

cgcca:c"ca tcggcggcgg gcgtgcggcc ggcggcngag gcgaggngcg a~tggcgagc 60

tcg'ccgtigg cgccggaca: ggagct.ggac cgccca^acc tggaggacta nctcccgccc 120

gacccgnncc cgcagaggcg ccccggaatc tccancngcg cga-c^gcng gacatcr.cgc 180

cggtgctcac cgaggcagcg gg-gccattg tcgatgactic cttcacacgg ngctttaag^ 240

caaattctcc agagcca'gg aattggaaca tatatc:gtt ccccttatgr gctttggcgt: 300

ataataag 308

<210> 100

<211> 282

<212> DNA
<213> Zea mays

<400> 100

cagaaactag ar.gttagtca cagcatggca ttaaattgtc atagnaaaca acancncac: 60

gagcaacta: gcaa:ttaat gccatgctgr gactaacttc tagttcctgg cattaaa::a 120

ccgtttggct actaggaaga ccgaggtaga gaagcaaata taagaatacc ctccaacgca 180

canccaaatg acagagtaaa tgaaggtagg gtccaccrtc tcgaacatga ccgtatactg 240

gttgttaaca caagttcctc tgggaaaatc agagagggtt tt 282

<210> 101 

<211> 282 

<212> DNA 

<213> Zea mays 

<400> 101 

ggcgcggctg gccgtggcgc tggtcctgcc gtacagtact cgacgccgat cctggcngcg 60 

acnggcatgt cgtggcggct caaagggtng cgcccngngc trgcnnngcc gtgctccggc 120 

gggcgc-gnc agctgttcgt gtgcaacnac cggacgctga tcgacccngt gtacg'gtcc 180

gtagcgtgga ccgggaaa z g cgcgncgrgt; nctacagnct gangcggntin tcggagctca 240

tctccccca; ngncggaang tgcacctgan accgggaacg gg 282

<210> 102 

<211> 290

<212> DNA

<213> Zea mays

<400> 102 

ggacgcggca ccatgcgcgc cgagctggcc agtggcgacg tggccgtgtg ccccgagggc 60

accacgtgcc gggagccctt cctgcrccgc ttctccaagc tcttcgcgga gctcagcgac 120

aggatcgtgc ccgtggcgat gaactaccgc gtggggctct tccacccgac gacggcgcgc 180

gggtggaaag ccatggaccc catcttcttc ttcatgaacn gcggcccgtg tacgaggtga 240

cgttcctgaa ccantccccg caaagcgacg tgcgcggcgg ggaagagccc 290

<210> 103 

<211> 279 

<212> DNA 

<213> Zea mays 

<400> 103 

acgaggtgac gttcctgaac cagctccccg cagaggcgac gtgcgcggcg gggaagagcc 60 

ccgttgatgt agccaactac gttcagcgga tactcgctgc cacgc:cggg ttcgagtgca 120 

ccaccctcac aaggaaggac aaatacacgg tgctcgccgg caacgacggc gtcctgaacg 180

ccaagccggc ggcggcccgg aagccggctt ggcagagccg cgtgaaggaa gtcctcgggt 240

tctgctccac caacaattac acct'gccca gatctggac 279

<210> 104 
<211> 315 

<212> DNA

<213> Zea mays

<400> 104 

gcccgagcgc atcgtctact acacctacaa gctcatgggc atcaggctcg tcgtcaaggg 60 

caccccgccg ccgccgccca agaagggcca cccgggcg:c ctcttcgtct gcaaccaccg 120 

caccgtgctc gaccccgtcg aggrggccgt ggcgctgcgc cgcaangtca gctgcgtcac 180 
tacagcatct ccaagttctc cgagctcatc tcgcccatca aggccgtagc agnaaagcag 240
gtcgcaaatg gagcagnagc gagtcgatgg aagngaattg gcgactggtc atctgcncga 300 
aggnacactg cggag 315 

<210> 105 

<211> 314 

<212> DNA 

<213> Zea mays 

<400> 105 

cgagacaccg agcacgtact accagcaaga tggtggcgtc tcccagattc aagcccatcg 60 

aggagtgctg ctcggagggg cggtcggagc agacggtggc cgccgacctg gacggcacgc 120 

tgctcatctc caggagcgcg ttcccctact acctcctcgt ggctctcgag gccggcagcg 180 

tcctccgcgc cgcgctgctg ctcctgtccg tgccgttcgt ctacgtcacc tacgccttct 240 

tctccgagtc gctggccatc agcacgctgg tgtacatctc cgtggcgggg ctcaaggtgc 300 

gcanatcgag atgg 314 

<210> 106 

<211> 291 

<212> DNA 

<213> Zea mays 

<400> 106

ctctgggtct ggggccgaga caccgagcac gtactaccag caagatggtg gcgtctccca 60 

gattcaagcc catcgaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg 120 

acc-ggacgg cacgctgctc atntccagga gcgcgttccc ctactacctc ctcgtggctc 180 

tcgaggccgg cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg 240 

tcacctacgc cttcttctcc gagtcgctgg ccatcagcac gctggtgtac a 291 

<210> 107 

<211> 300 

<212> DNA 

<213> Zea mays 

<400> 107 

gcacgcagca gtacgacgtc tctcctctgg gtctggggcc gagacaccga gcacgtacta 60 

ccagcaagat ggtggcgtct cccagattca agcccatcga ggagtgctgc tcggaggggc 120 

ggtcggagca gacggtggcc gccgacctgg acggcacgct gctcatctcc aggagcgcgt 180 

tcccctacta cctcctcgtg gctctcgagg ccggcagcgt cctccgcgcc gcgctgctgc 240 

tcctgtccgt gccgttcgtc tacgtcacct acgccttctt ctccgagtcg ctggccatca 300 

<210> 108 
<211> 284 
<212> DNA 
<213> Zea mays 

<400> 108 

gnggccgaga caccgagcac gtactaccag cangatggtg gcgtctccca gattcangcc 60 

antcgaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg acctggacgg 120 

cacgctgctc atctccagga gcgcgttccc ctacnacctc ctcgtggctc tcgaggccgg 180 

cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg tcactacgcc 240 

ttcttctccg agtcgctggc catcaanacg ctggtgtaca tctc 284 

<210> 109 
<211> 280 
<212> DNA 
<213> Zea mays 

<400> 109 

ctcctctggg tctggggccg agacaccgag cacgtactac cagcaagatg gtggcgtctc 60 

ccagattcaa gcccatcgag gagtgctgct cggaggggcg gtcggagcag acggtggccg 120 

ccgacctgga cggcacgctg ctcatctcca ggagcgcgtt ccnctactac ctcctcgtgg 180 

ctctcgaggc cggcagcgtc ctccgcgccg cgctgctgct cctgtccgtn ccgttcgtct 240 

acgtcaccta cgcnttnttc tccgagtcgc tggccatcag 280 

<210> 110 
<211> 287 
<212> DNA 
<213> Zea mays 
<400> 110

cgtctctcct ccgggtic^gc ggccgagaca ccgagcacg: actaccagca agarggtggc 60

gictcccaga ticaagccca tcgaggag~g ccgc~cggag gggcggtcgg ageagaeggw 120

ggccgccgac cnggacggca g:tcc:catc :c:aggagcg eg::ccccta ccac::cc:c 180

g:ggc^ctcg aggccggcag cgtcctccgc gccgcgctgc tgctcc:?:c cgtg:cgttc 240

gictacgtca c^acggcntc --c-c:gagt cg:cggcca: cagcacc 287



<210> 111 
<211> 286 
<212> DNA
<213> Zea mays

<400> 111

cgcacagtta cgacgtctct cctctgggtc tggggecgag acaccgagca cgtactacca 60

gcaagatgg: ggcgtcccc: agat:caagc ccaccgagga grgctgcncg gaggggeggt: 120

eggagcagae ggtggccgcc gaccrggacg gcacgctigct catctccagg agcgcgttcc 180

cctactactc ctcgtgczcz cgaggccggc aggtcctccg cgccgcgctg rgc~cc tgtc 240

gtgcgrtcgt etiagtcacta cgct::tc:c gancgtggca a~aana 286



<210> 112 

<211> 323

<212> DNA

<213> Zea mays

<400> 112

gttantccct gaaggcacca caacaaatgg gagattcctg acttcgttcc aacaiggtgc 60

aUcatacct ggccaccctg t:caa:ctgr tgttgnccgt: tatccacatg tgcactttga 120

tcaatcatgg gggnatatat cgttattaaa gctcatgttt aagatgt:ca cccaatctca 180

taar.ttcatg gaggtagagt a:cttcctgt tgtctaccct cctgagacca agcaagagaa 240

tgcccttca: tttgeggagg ataccagcta tgctatggca cgtgccctca a:gtcctgcc 300

aacctccta: tcatatggtg att 323



<210> 113
<211> 312
<212> DNA
<213> Zea mays

<400> 113

egataaggee cttttcgaag agcttictiacc greggatcaa cagattcttg gccgagctgc 60

tgtggcttca gcttgtctgg gtggtggact ggtgggcagg tgttaaggta caactgeatg 120

cagatgagga aacttacaga tcaatgggta aagagcatgc actcatcata tcaaatcatc 180

ggagtgatat tgattggctc attggatgga tattggecca gcgttcaggg tgccttggaa 240

gtacacttgc tgtcatgaag aagticarcca agttccttcc agttattggc tggtcaatgt 300

ggttitgcaga gt 312



<210> 114 

<211> 279

<212> DNA

<213> Zea mays

<400> 114

agtggggtct ccaaaggttg aaagacttcc ctagaccatt tnggctagc: ctttttgttg 60
agggtactcg ctttactcca gcaaagcttc tcgcagctca ggagtatgcg gcttcccagg 120

gcttaccagc tcctagaaa gtacttattc cacgtaccaa gggatttgta tetgeegtaa 180

gtat^atgcg agattttgtt ccagccattt acgatacaa tgtaatag:: cctaaagatt 240
cccctcaacc aacaatgetg eggattttga aagggcaat 279



<210> 115 
<211> 304 
<212> DNA 
<213> Zea mays 

<400> 115 

cgtcaacg:c atccaggccg tcctatttgt gacgataagg cccttttcga agagcttcta 60

ccgtcgga:c aacagattct tggccgagcc gctigtggct: :: cagctngtct gggtggtgga 120

ctggtggg:a ggtgttaagg tacaac:gca tgcagatgag gaaacttaca gatcaatggg 180

aaagagcit gcactcatca tatcaaatca teggagtgat attgattggc tcatggatgg 240

atattggece agegttcagg gtgccttgga agtiacattgc tgtcatgaag aagtcatcc 300
agt:t 304
<210> 116
<211> 259
<212> DNA

<213> Zea mays

<400> 116 

cttcctcctg tccggcctca tcgtcaacgc catccaggcc gccctatttg tgacgataag 60 

gcccntttcg aagagcttct aacgtcggat caacagattc ntggccgagc tgctgtggct 120 

tcagcttgtc tgggtggtgg acnggtgggc aggtgttaag gtacaactgc atgcngatga 180 

ggaaacttac agatcnatgg gtanagagca tgcactcatc atatcaaatc atcggagtga 240 

tattgattgg cncattgga 259

<210> 117 
<211> 235 
<212> DNA 
<213> Zea mays 

<400> 117
attccacgta ccaagggatt tgtatctgct gtaagtatta tgcgagattt tgttccagcc 60
atttatgata caactgtaat agttcctaaa gattcccctc aaccaacaat gctgcggatt 120 
ttgaaagggc aatcatcagt gatacatgtc cgcatgaaac gtcatgcaat gagtgagatg 180 
ccaaaaticag atgaggatgt ttcaaaatgg tgtaaagaca tttttgtggc aaagg 235 

<210> 118 
<211> 282 
<212> DNA 
<213> Zea mays 

<400> 118 

tgagatgcca aaatcagatg atgacgtttc aaaatggtgt aaagacattt ttgtgacaaa 60

ggatgcctta ctggacaaac atttggcaac aggcactttc gatgaggaga ttagacctat 120

cggccgccca gtgaaatcat tgctggtgac cctgttttgg tcgtgcctgc tgttgtttgg 180

tgccatcgag ttcttcaagt ggacgcagct cctatcgaca tggagaggag tggcattcac 240

tgccgcagga tggcgctcgt gacaggggtc atgcacgtct tc 282 

<210> 119 

<211> 166 

<212> DNA 

<213> Zea mays 

<400> 119 

ctggtgggca ggcgttaagg tacaactaca tgcggatgag gacacttacc gatcaatggg 60

taaagagcat gcactcgtca tatcaaatca tcgaagtgat attgattggc ttattggatg 120 

gatattggcc cagcgctcag ggtgccttgg aagtacgctc gctgtc 166 

<210> 120

<211> 234

<212> DNA
<213> Zea mays

<400> 120

agtcanccaa gntccttcca gtcattggct ggtcaatgtg gtttgcagag tacctctttt 60

nggagaggag ctgggccaag gatgaaaaga cactaaagtg gggtctccaa aggttgaaag 120

actcccctag accatttngg ctagctcttn tttgtngagg gnantcgctt tactccagca 180 

angnttntng aggnnncagn agnnncgggn ttcccanggg ttaacagncc cana 234 

<210> 121 

<211> 210 

<212> DNA
<213> Zea mays

<400> 121

gcgdga:g:n aaaatcagat gatgacgttt caaaatggtg taaagacatt tttgtggaca 60 

aaggatgc-^ tactggacaa acatttggca acaggcactt tcgatgagga gattagacct 120 

aicggccgcc cagtgaaatc atngcnggtg accctgtnnt ggccgtgcct gctgttgttt 180 
ggtgccatcg agntcctcaa gtggacgcag 210 

<210> 122
<211> 274

<212> DNA
<213> Zea mays

<400> 122

acr.cccgaat ccgccgcgcg cgcnccgncc ccgtcgccgg cggaggcgcc cgcnaccgcc 60

cacagcagcc tatcgccgga gaaggaacg: cgcggggagc ttttccacng ccatc-cccg 120

:c:gacccc: ccgagatcgn aagcggcgg: catggcgatc ccgct:gtgc ccg:cgtgc: 180

cccgctcggc c:cctcttcc tcctgtccgg cc:ca czz aacaccatcc acgccatcct 240

atttg"gaca ataaggccct tt::caagag czzg 274



<210> 123 
<211> 305
<212> DNA
<213> Zea mays

<400> 123

:tgcactgag gaaaggccat lagggacata rcaagtacat acataagagc agcttgatga 60

agctgcc-at ctttagctgg gcatctcaca t:t:tgagt: ratcccggta gaacggaaa: 120

gggagattga tgaagcaatc a:tcagaaca agctatcaaa attcaagaac ccgagaga^c 180

ctatctgc:t ggcggctttt cccgaaggca cggattana: tgagaagaaa tgcatcatga 240

g-.caagagta tgcttcagaa cacggct^g-: c:a:gctaga acatgtcctc c:tccaaaga 300
caagg 305



<210> 124

<211> 279

<212> DNA

<213> Zea mays

<400> 124

ccagatmtc tggacaatgt gra:ggcgtt gatccrtctg aagtccacat ccacgtcaga 60

atggttcagc tccatcacat ccccacaaca gaagacaaga taacagaatg gatggrxgag 120

aggtttaggc agaaggacca gctcctggca ga::tct:ca tgaaggggca cutcctgatg 180

aaaggaactg aaaggagatc tgtcgacgcc gag^gccngg caaactttct taaccagtag 240

tatgc-r.gac ggccnatctg gtttigtacct aaacccttt 279



<210> 125 

<211> 219 

<212> DNA

<213> Zea mays

<400> 125

agattttntg gacaatgtgt atggngttga tccttntgaa gtncacatcc acgtnagaat 60

ggttcagctc catcacatcc ccacaacagn agacaagata acagaangga tggtagagag 120

gtttaggcag aaggaccagc cccggcaga cttrttcatg aaggggcact ttcctgatga 180

aggaactgaa ggagatctgt cgacgccgaa gtig^ctggc 219

<210> 126
<211> 293
<212> DNA
<213> Zea mays

<400> 126

taccatagat gctgtgtacg acatcacgat cgcntacaaa caccggcngc ngacatttct 60

ngacaacgtc tacngcgtgg ntccttcgg-a agtccacatc cacatcanca gcatccaggt 120

c r :ccgacata ncggcgtccg aaaaacgggg tggctggcng gntnngtgga gcggttcaag 180

gcr.tngar.na acgagctngc tgtccggggc tttctaccgc ggctggggcc aatt:cnccc 240

cgaacgaaag ggaaaaaggg gaaccgaagg ggggaacctg ttngaacggg ncc 293



<210> 127 
<211> 6 
<212> PRT 

<213> conserved sequence 



<400> 127 

Val Xaa Asn His Xaa Ser 
1 5 



<210> 128 
<211> 6 
<212> PRT
<213> conserved sequence
<400> 128 

Val Thr Tyr Ser Xaa Ser 
1 5 



<210> 129 
<211> 7 
<212> PRT

<213> conserved sequence 
<400> 129 

Val Xaa Leu Thr Arg Xaa Arg 
1 5 



<210> 130 
<211> 5 
<212> PRT 

<213> conserved sequence 
<400> 130 

Cys Pro Glu Gly Thr 
1 5 



<210> 131 
<211> 5 
<212> PRT 

<213> conserved sequence 
<400> 131 

Ile Val Pro Val Ala
1 5 



<210> 132 
<211> 7 
<212> PRT 

<213> conserved sequence 
<400> 132 

Leu Xaa Xaa Gly Asp Leu Val 
1 5 



<210> 133 
<211> 6 
<212> PRT 

<213> conserved sequence 
<400> 133 

Phe Xaa Xaa Gly Ala Phe 
1 5 



<210> 134 
<211> 6 
<212> PRT 

<213> Synthetic Oligonucleotide 
<400> 134 

Val Ala Asn Xaa Xaa Gln
1 5 



<210> 135 
<211> 30 
<212> DNA 
<213> Synthetic Oligonucleotide

<400> 135

ccatccgc:: caagggaacg acaccca:ca 30



<210> 136 
<211> 31 

<212> DNA

<213> Synthetic Oligonucleotide
<400> 136

t ccc tg t gcttgatgaa cttaaagctc g 31 

<210> 137 
<211> 30

<212> DNA

<213> Synthetic Oligonucleotide
<400> 137

acagcaggag tgtctgatga tggcagattc 30 

<210> 138

<211> 30

<212> DNA

<213> Synthetic Oligonucleotide

<400> 138

actggagttc cagccaaaaa tgcacctgtc 30 

<210> 139

<211> 30
<212> DNA

<213> Synthetic Oligonucleotide
<400> 139

gatacac::t tgaaatcagg cgattttgct 30

<210> 140
<211> 30
<212> DNA

<213> Synthetic Oligonucleotide
<400> 140

ttgcaaattc aattcctgtt tcaccgggcc 30

<210> 141
<211> 30
<212> DNA

<213> Synthetic Oligonucleotide
<400> 141

gttttctgct attccagaag gcgtcaacaa 30

<210> 142
<211> 31
<212> DNA

<213> Synthetic Oligonucleotide
<400> 142

cattgaagat ccgtccgtga agttncctta cc 32

<210> 143

<211> 30

<212> DNA

<213> Synthetic Oligonucleotide

<400> 143

tcgagctgtg atcgatgatt ggctgtgaag 30

<210> 144
<211> 30
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 144 

gcctcttcaa aaacacacac acacgtctct 30 

<210> 145 
<21I> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 145 

gtctcttcaa aaacacacac acacgtctct 30 

<210> 146 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 146 

gtagagagcc ttacttgctt cggtttagtc 30 

<210> 147 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 147 

acgtcatcgt acctgttgct attgactcac 30 

<210> 148 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 148 

acttttccat tgtcagggac tcctcgacac 30 

<210> 149 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 149 

acggtgtagg aagggaaagg attcaaaagg 30

<210> 150 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 150 

gcgatgaact acagagtcgg attcttcctc 30 

<210> 151 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 151 

ccggtttacg agattacgtt cttgaaccag 30 

<210> 152 

<211> 30 

<212> DNA 

<213> Synthetic Oligonucleotide 



<400> 152 

caatggagac aaggctcgaa agtgctaacc 



30 
<210> 153
<211> 30
<212> DNA

<213> Synthetic Oligonucleotide
<400> 153

attctctgaa catagttcgc cacggtcatg 30

<210> 154
<211> 30
<212> DNA

<213> Synthetic Oligonucleotide
<400> 154

gaaatccaac gcctt:ccaa tatcactctg 30

<210> 155
<211> 30

<212> DNA

<213> Synthetic Oligonucleotide
<400> 155

cttcaact.tt ccatcaggat cttggcacgt 30

<210> 156
<211> 30
<212> DNA

<213> Synthetic Oligonucleotide
<400> 156

accacttgtt agagarctta cctgcttagg 30

<210> 157
<211> 30
<212> DNA

<213> Synthetic Oligonucleotide
<400> 157

tcctacctac accatccaat ttctcgaccc 30

<210> 158
<211> 30
<212> DNA

<213> Synthetic Oligonucleotide
<400> 158

ctgcgtcaag tgagcaactc agttcttgca 30

<210> 159
<211> 30
<212> DNA

<213> Synthetic Oligonucleotide
<400> 159

tgggaagcag cacgttgttc agtatcggaa 30

<210> 160
<211> 30
<212> DNA

<213> Synthetic Oligonucleotide
<400> 160

tagcctctgt gtaatctgtg ccctcgggga 30

<210> 161
<211> 1702
<212> DNA

<213> Simmondsia chinensis
<400> 161

gaattctagc ctctctcctc ctgcaatcct acttgctttc tacgatcttt ccctctctct 60 
ctctaaaacc ttaaaattgg aatggaatcg tttaaaaata tgatcttttt gtaattgaat 120 
tagtataatt atatctgggt aatcttgaat ttgttggtga ggccatgggg atcccagctg 180 
cggctgtgat tgtaccgctt ggcttgctct tcttcttctc tggtctcttc atcaacttca 240 
ttcaggcaat ttgttttgtg ctcgtgcggc cactgtcaaa gnntacatac agaaggatta 300
acagggtgct ggtggaattg ttgtggcttg agctgatatg gctcgtagat tggtgggcaa 360
gtgttaagat caagttgttc acagatcctg atacctttcg gctaatgggt aaagagcatg 420 
cacttgtgat atcaaaccac agaagtgata ttgattggct tgttggatgg gtgttggccc 480 
agagatcagg ctgcctggga agcacactgg ctgtcatgaa gaaatcatca aagtttctcc 540 
cggtcatagg ttggtctatg tggttttctg agtacctttt tcttgagaga agctgggcca 600 
aggatgaaag cacattgaag ttaggtcttc aacgcctcaa ggactaccct ctgcctttct 660 
ggttggctct tttcgtagaa ggaacacgat ttacccaagc taaactttta gcagctcaag 720 
aatatgctac ttcaatggga ttgccagttc ctagaaatac tttgatccct cgtactaagg 780 
gatttgtttc agccgtgagc catatgcgtt cgtttgtccc ggccatatat gatgtaacgg 840 
tggccatccc taaaccttct tcgcagccta caatgctcag acttttcaaa ggccagccat 900 
ccacggttca tgtacacatc aagcgccgct cgatgaaaga tctccctgaa gcagcagatg 960 
atgttgcaca atggtgtcga gacacattcg tcgcaaagga tgcactcctg gacaagcata 1020
atgtagatga cactttcgga gatgagtatc tgcaggacac tggccggcct ttgaaatctc 1080
tctttgtagc agtctcttgg gcattgattc tcatcctggg aggtttgaaa ttcctacgat 1140
ggtcgtccct tctatcatca tggaaggggg tcgccttctc agccgcatgc cttgtgctcg 1200
tcaccattct tatgcagatc ttaatccaat tttctcaatc cgagcgctcg actcctgcta 1260
aggtagcccc aggaaagccc aagaacatgg tatcagaacc cacggaaacg caacgacata 1320
agcagcacta aaagtatata tggaccccaa ctaagaagat tcagacgcaa gccacagttg 1380
attcaactgt tcagaatgtc aaatatagtt tgagaaacaa aagatcaaga ttagctgatg 1440
aagagcctaa tgaacctaca tacttggatc tgtcgtcgcc accgtctgct gctagctcgt 1500
tatcagaact cgtgattccg ggaccgatcc cggatcttag ccttctatgc atggattatg 1560
a:agtatctt aaatttcttt aatgatgtac cggaattata atgttagtta attaggggga 1620
tgagcattgt ttgggtttat atcgtggtaa atccttgtat tgtttataag atttgaagaa 1680
aattcgattc gagtgctctg aa 1702

<210> 162 
<211> 387 
<212> PRT

<213> Simmondsia chinensis 
<400> 162 

Met Gly Ile Pro Ala Ala Ala Val Ile Val Pro Leu Gly Leu Leu Phe
1 5 10 15

Phe Phe Ser Gly Leu Phe Ile Asn Phe Ile Gln Ala Ile Cys Phe Val
20 25 30

Leu Val Arg Pro Leu Ser Lys Thr Tyr Arg Arg Ile Asn Arg Val Leu
35 40 45

Val Glu Leu Leu Trp Leu Glu Leu Ile Trp Leu Val Asp Trp Trp Ala
50 55 60

Ser Val Lys Ile Lys Leu Phe Thr Asp Pro Asp Thr Phe Arg Leu Met
65 70 75 80

Gly Lys Glu His Ala Leu Val Ile Ser Asn His Arg Ser Asp Ile Asp
85 90 95

Trp Leu Val Gly Trp Val Leu Ala Gln Arg Ser Gly Cys Leu Gly Ser
100 105 110
Thr Leu Ala Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val Ile Gly
115 120 125

Trp Ser Met Trp Phe Ser Glu Tyr Leu Phe Leu Glu Arg Ser Trp Ala
130 135 140

Lys Asp Glu Ser Thr Leu Lys Leu Gly Leu Gln Arg Leu Lys Asp Tyr
145 150 155 160

Pro Leu Pro Phe Trp Leu Ala Leu Phe Val Glu Gly Thr Arg Phe Thr
165 170 175

Gln Ala Lys Leu Leu Ala Ala Gln Glu Tyr Ala Thr Ser Met Gly Leu
180 185 190

Pro Val Pro Arg Asn Thr Leu Ile Pro Arg Thr Lys Gly Phe Val Ser
195 200 205

Ala Val Ser His Met Arg Ser Phe Val Pro Ala Ile Tyr Asp Val Thr
210 215 220

Val Ala Ile Pro Lys Ser Ser Ser Gln Pro Thr Met Leu Arg Leu Phe
225 230 235 240

Lys Gly Gln Pro Ser Thr Val His Val His Ile Lys Arg Arg Ser Met
245 250 255

Lys Asp Leu Pro Glu Ala Ala Asp Asp Val Ala Gln Trp Cys Arg Asp
260 265 270

Thr Phe Val Ala Lys Asp Ala Leu Leu Asp Lys His Asn Val Asp Asp
275 280 285

Thr Phe Gly Asp Glu Tyr Leu Gln Asp Thr Gly Arg Pro Leu Lys Ser
290 295 300

Leu Phe Val Ala Val Ser Trp Ala Leu Ile Leu Ile Leu Gly Gly Leu
305 310 315 320

Lys Phe Leu Arg Trp Ser Ser Leu Leu Ser Ser Trp Lys Gly Val Ala
325 330 335

Phe Ser Ala Ala Cys Leu Val Leu Val Thr Ile Leu Met Gln Ile Leu
340 345 350

Ile Gln Phe Ser Gln Ser Glu Arg Ser Thr Pro Ala Lys Val Ala Pro
355 360 365

Gly Lys Pro Lys Asn Met Val Ser Glu Pro Thr Glu Thr Gln Arg His
370 375 380

Lys Gln His
385



<210> 163 
<211> 43
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 163

aagcttgcat gcgtcgacac aatggttcat gcgaccaagt cag 43

<210> 164 
<211> 35 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 164 

ggtaccgtcg actcacttct tggtgttgtt gatag 35 

<210> 165 

<211> 44 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 165

ggatccgcgg ccgcacaatg acgagcttta ctacttccct tcat 44

<210> 166

<211> 38

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 166

ggatcccctg caggttagag atccattgat tctgcaat 38

<210> 167 

<211> 38

<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 167 

ggatccgcgg ccgcataatg gaatcagagc tcaaagat 38 

<210> 168 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 168 

ggatcccctg caggtcattc ttctttctga tggaaatc 38 

<210> 169 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 169

ggatccgcgg ccgcacaatg actcgttcac aagatgtttc a 41

<210> 170

<211> 38

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 170

gcatcccctg caggtcactt ctcttccaat ctagccag 38 

<210> 171 

<211> 46
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 171 

gcatccgcgg ccgcacaatg tccggtaata agatctcgac tcttca 46 

<210> 172
<211> 46

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 172

ggatcccctg caggttattt tttcttgaca actccgttat taccgg 46 

<210> 173

<211> 39

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 173

atatccgcgg ccgcacaatg gttatggagc aagctggaa 39 

<210> 174

<211> 38

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 174

ggatcccctg caggtcaatg gagacaaggc tcgaaagt 38

<210> 175 
<211> 42 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 175 

ggatccgcgg ccgcacaatg tccgccaaga tttcaatatt cc 42



<210> 176 
<211> 38 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 176 

ggatcccctg caggttaatt tttcttaact actccatt 38 



<210> 177 
<211> 42 
<212> DNA 

<213> Artificial Sequence 



<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 177

ggatccgcgg ccgcacaatg ggagctcagg agaaacggcg cc 42 



<210> 178 
<211> 38
<212> DNA 

<213> Artificial Sequence 



<220>

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 178 

ggatcccctg caggtcacgt cttctccttc ttcaccgg 38 



<210> 179 
<211> 44 
<212> DNA

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 179 

ggatccgcgg ccgcacaatg gcggatcctg atctgtcttc tcct 44 



<210> 180 
<211> 44
<212> DNA 

<213> Artificial Sequence 



<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 180 

ggatcccctg caggttatgt tggggccaag tcaggtgcaa agat 44 



<210> 181 
<211> 44
<212> DNA

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 181

ggatccgcgg ccgcaaaatg gaaaaaaaga g:guaccaaa ttct 



<210> 182
<211> 46
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 182

ggatcccctg caggttatct gtttactaat ttgagggaat tttttg 46

<210> 183
<211> 36

<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 183

tcgacctgca ggaagcttaa ggatggtgat tgctgc 36

<210> 184

<211> 31

<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 184

ggatccgcgg ccgcttactt ctccttctcc g 31 

<210> 185 

<211> 39 

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 185

ggatccgcgg ccgcacaatg tcttttaggg atgtcctag 39 

<210> 186
<211> 41

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 186

ggaccccctg caggtcaatc atccttaccc tttggtttac c 41 

<210> 187

<211> 60
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 187 

atgtctttta gggatgtcct agaaagagga gatgaatttt ctgtgcggta tttcacaccg 60 

<210> 188 

<211> 60 

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 188

tcaatcatcc ttaccctttg gtttaccctc tggaggcaga agattgtact gagagtgcac 60 

<210> 189 
<211> 44 
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 189 

ggatccgcgg ccgcacaatg aagcattccc aaaaataccg tagg 44 



<210> 190
<211> 41 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 190 

ggatcccctg caggtcaatg attttttttc atcacaaata c 41 



<210> 191 
<211> 60 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 191 

atgaagcatt cccaaaaata ccgtaggtat ggaatttatg ctgtgcggta tttcacaccg 60 



<210> 192
<211> 60
<212> DNA



<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 192 

tcaatgattt tttttcatca caaatacaag aataagaaaa agattgtact gagagtgcac 60 



<210> 193 
<211> 43 
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 193

ggatccgcgg ccgcacaatg ggttttgttg atttcttcga aac 43

<210> 194
<211> 45
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 194

ggatcccctg caggttattt ggtctcaatt ttaatatttt tttgc 45 

<210> 195
<211> 60

<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 195

atgggttttg ttgatttctt cgaaacatat atggtcggtt ctgtgcggta tttcacaccg 60

<210> 196

<211> 60

<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 196

ttatttggtc tcaattttaa tatttttttg caaggactcg agattgtact gagagtgcac 60

<210> 197
<211> 44
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 197

ggatccgcgg ccgcacaatg gaaaagtaca ccaattggag agac 44

<210> 198

<211> 42

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 198

ggatcccctg caggctactt cctcttttta ccttgatcgc tg 42 

<210> 199 
<211> 60 
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 199 

atggaaaagt acaccaattg gagagacaat ggtacgggaa ctgtgcggta tttcacaccg 60 



<210> 200 
<211> 60
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 200 

ctacttcctc tttttacgtt gatcgctgat atattccttc agattgtact gagagtgcac 60 



<210> 201 
<211> 41 
<212> DNA

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 201 

ggatccgcgg ccgcacaatg cctgcaccaa aactcacgga g 41 



<210> 202 
<211> 38 
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 202 

ggatcccctg caggctacgc atctccttct ttcccttc 38 



<210> 203 
<211> 60 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 203

atgcctgcac caaaactcac ggagaaatct gcctcttcca ctgtgcggta tttcacaccg 60 



<210> 204 
<211> 60
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 204 

ctacgcatct ccttctttcc cttcttcttc ttcttcctct agattgtact gagagtgcac 60 

<210> 205

<211> 46

<212> DNA

<213> Artificial Sequence

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 205

ggatccgcgg ccgcacaatg tctgctcccg ctgccgatca taacgc 46

<210> 206

<211> 44
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 206

ggatcccctg caggtcactc tttcttttcg tgttctcttt tctg 44

<210> 207

<211> 60
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 207

atgtctgctc ccgctgccga tcataacgct gccaaaccta ctgtgcggta tttcacaccg 60

<210> 208

<211> 60
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 208

tcattctttc ttttcgtgtt ctcttttctg tcttaccagc agattgtact gagagtgcac 60 

<210> 209 

<211> 49
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 209

ggatccgcgg ccgcacaatg ctgcatcaaa aaatagctca taaagttcg 49 

<210> 210
<211> 49
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 210

ggatcccctg caggtcaaaa aataaaacaa taaagtttat aaactaacc 49



<210> 211

<211> 60

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 211

atgctgcatc aaaaaatagc tcataaagtt cgaaaagtcg ctgtgcggta tttcacaccg 60 

<210> 212 

<211> 60

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 212

tcaaaaaata aaacaataaa gtttataaac taaccaaatt agattgtact gagagtgcac 60 

<210> 213

<211> 41

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 213 

ggatccgcgg ccgcacaatg agtgtgatag gtaggttctt g 41 

<210> 214 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 214

ggatcccctg caggttaatg catctttttt acagatgaac c 41 

<210> 215 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 215 

atgagtgtga taggtaggtt cttgtattac ttgaggtccg ctgtgcggta tttcacaccg 60 

<210> 216 
<211> 60 
<212> DNA 

<213> Artificial Sequence 



<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 216

t caatgca: c 1 1 1 1 :cacag atgaacc::c gttatgggta aga:"gtact gagagrgcac 60 

<210> 217 

<211> 381

<212> PRT

<213> Saccharomyces sp.

<220>

<400> 217

Met Ser Phe Arg Asp Val Leu Glu Arg Gly Asp Glu Phe Leu Glu Ala
1 5 10 15

Tyr Pro Arg Arg Ser Pro Leu Trp Arg Phe Leu Ser Tyr Ser Thr Ser
20 25 30

Leu Leu Thr Phe Gly Val Ser Lys Leu Leu Leu Phe Thr Cys Tyr Asn
35 40 45

Val Lys Leu Asn Gly Phe Glu Lys Leu Glu Thr Ala Leu Glu Arg Ser
50 55 60

Lys Arg Glu Asn Arg Gly Leu Met Thr Val Met Asn His Met Ser Met
65 70 75 80

Val Asp Asp Pro Leu Val Trp Ala Thr Leu Pro Tyr Lys Leu Phe Thr
85 90 95

Ser Leu Asp Asn Ile Arg Trp Ser Leu Gly Ala His Asn Ile Cys Phe
100 105 110

Gln Asn Lys Phe Leu Ala Asn Phe Phe Ser Leu Gly Gln Val Leu Ser
115 120 125

Thr Glu Arg Phe Gly Val Gly Pro Phe Gln Gly Ser Ile Asp Ala Ser
130 135 140

Ile Arg Leu Leu Ser Pro Asp Asp Thr Leu Asp Leu Glu Trp Thr Pro
145 150 155 160

His Ser Glu Val Ser Ser Ser Leu Lys Lys Ala Tyr Ser Pro Pro Ile
165 170 175

Ile Arg Ser Lys Pro Ser Trp Val His Val Tyr Pro Glu Gly Phe Val
180 185 190

Leu Gln Leu Tyr Pro Pro Phe Glu Asn Ser Met Arg Tyr Phe Lys Trp
195 200 205

Gly Ile Thr Arg Met Ile Leu Glu Ala Thr Lys Pro Pro Ile Val Val
210 215 220

Pro Ile Phe Ala Thr Gly Phe Glu Lys Ile Ala Ser Glu Ala Val Thr
225 230 235 240

Asp Ser Met Phe Arg Gln Ile Leu Pro Arg Asn Phe Gly Ser Glu Ile
245 250 255

Asn Val Thr Ile Gly Asp Pro Leu Asn Asp Asp Leu Ile Asp Arg Tyr
260 265 270

Arg Lys Glu Trp Thr His Leu Val Glu Lys Tyr Tyr Asp Pro Lys Asn
275 280 285

Pro Asn Asp Leu Ser Asp Glu Leu Lys Tyr Gly Lys Glu Ala Gln Asp
290 295 300

Leu Arg Ser Arg Leu Ala Ala Glu Leu Arg Ala His Val Ala Glu Ile
305 310 315 320

Arg Asn Glu Val Arg Lys Leu Pro Arg Glu Asp Pro Arg Phe Lys Ser
325 330 335

Pro Ser Trp Trp Lys Arg Phe Asn Thr Thr Glu Gly Lys Ser Asp Pro
340 345 350

Asp Val Lys Val Ile Gly Glu Asn Trp Ala Ile Arg Arg Met Gln Lys
355 360 365

Phe Leu Pro Pro Glu Gly Lys Pro Lys Gly Lys Asp Asp
370 375 380

<210> 218 
<211> 396 
<212> PRT 

<213> Saccharomyces sp.

<220>

<400> 218

Met Lys His Ser Gln Lys Tyr Arg Arg Tyr Gly Ile Tyr Glu Lys Thr
1 5 10 15

Gly Asn Pro Phe Ile Lys Gly Leu Gln Arg Leu Leu Ile Ala Cys Leu
20 25 30

Phe Ile Ser Gly Ser Leu Ser Ile Val Val Phe Gln Ile Cys Leu Gln
35 40 45

Val Leu Leu Pro Trp Ser Lys Ile Arg Phe Gln Asn Gly Ile Asn Gln
50 55 60

Ser Lys Lys Ala Phe Ile Val Leu Leu Cys Met Ile Leu Asn Met Val
65 70 75 80

Ala Pro Ser Ser Leu Asn Val Thr Phe Glu Thr Ser Arg Pro Leu Lys
85 90 95

Asn Ser Ser Asn Ala Lys Pro Cys Phe Arg Phe Lys Asp Arg Ala Ile
100 105 110

Ile Ile Ala Asn His Gln Met Tyr Ala Asp Trp Ile Tyr Leu Trp Trp
115 120 125

Leu Ser Phe Val Ser Asn Leu Gly Gly Asn Val Tyr Ile Ile Leu Lys
130 135 140

Lys Ala Leu Gln Tyr Ile Pro Leu Leu Gly Phe Gly Met Arg Asn Phe
145 150 155 160

Lys Phe Ile Phe Leu Ser Arg Asn Trp Gln Lys Asp Glu Lys Ala Leu
165 170 175

Thr Asn Ser Leu Val Ser Met Asp Leu Asn Ala Arg Cys Lys Gly Pro
180 185 190

Leu Thr Asn Tyr Lys Ser Cys Tyr Ser Lys Thr Asn Glu Ser Ile Ala
195 200 205

Ala Tyr Asn Leu Ile Met Phe Pro Glu Gly Thr Asn Leu Ser Leu Lys
210 215 220

Thr Arg Glu Lys Ser Glu Ala Phe Cys Gln Arg Ala His Leu Asp His
225 230 235 240

Val Gln Leu Arg His Leu Leu Leu Pro His Ser Lys Gly Leu Lys Phe
245 250 255

Ala Val Glu Lys Leu Ala Pro Ser Leu Asp Ala Ile Tyr Asp Val Thr
260 265 270

Ile Gly Tyr Ser Pro Ala Leu Arg Thr Glu Tyr Val Gly Thr Lys Phe
275 280 285

Thr Leu Lys Lys Ile Phe Leu Met Gly Val Tyr Pro Glu Lys Val Asp
290 295 300

Phe Tyr Ile Arg Glu Phe Arg Val Asn Glu Ile Pro Leu Gln Asp Asp
305 310 315 320

Glu Val Phe Phe Asn Trp Leu Leu Gly Val Trp Lys Glu Lys Asp Gln
325 330 335

Leu Leu Glu Asp Tyr Tyr Asn Thr Gly Gln Phe Lys Ser Asn Ala Lys
340 345 350

Asn Asp Asn Gln Ser Ile Val Val Thr Thr Gln Thr Thr Gly Phe Gln
355 360 365

His Glu Thr Leu Thr Pro Arg Ile Leu Ser Tyr Tyr Gly Phe Phe Ala
370 375 380

Phe Leu Ile Leu Val Phe Val Met Lys Lys Asn His
385 390 395



<210> 219

<211> 479

<212> PRT

<213> Saccharomyces sp.
<220>

<400> 219

Met Gly Phe Val Asp Phe Phe Glu Thr Tyr Met Val Gly Ser Arg Val
1 5 10 15

Gln Phe Lys Gln Leu Asp Ile Ser Asp Trp Leu Ser Leu Thr Pro Arg
20 25 30

Leu Leu Ile Leu Phe Gly Tyr Phe Tyr Leu His Ser Phe Phe Thr Ala
35 40 45

Ile Asn Gln Phe Leu Gln Phe Ile Asn Thr Asn Ser Phe Cys Leu Arg
50 55 60

Leu His Leu Leu Tyr Asp Arg Phe Trp Ser His Val Pro Ile Ile Gly
65 70 75 80

Glu Tyr Lys Ile Arg Leu Leu Ser Arg Ala Leu Thr Tyr Ser Lys Leu
85 90 95

Lys Ile Ile Pro Thr Leu Asp Lys Val Leu Glu Ala Ile Glu Ile Trp
100 105 110

Phe Gln Leu His Leu Val Glu Met Thr Phe Glu Lys Lys Lys Asn Val
115 120 125

Gln Ile Phe Ile Thr Glu Gly Ser Asp Asp Leu Asn Phe Phe Lys Asp
130 135 140

Ser Lys Phe Gln Thr Thr Leu Met Ile Cys Asn His Arg Ser Val Asn
145 150 155 160

Asp Tyr Thr Leu Ile Asn Tyr Leu Phe Leu Lys Ser Cys Pro Thr Lys
165 170 175

Phe Tyr Thr Lys Trp Glu Phe Leu Gln Lys Leu Arg Lys Gly Glu Asp
180 185 190

Leu Ala Glu Trp Pro Gln Leu Lys Phe Leu Gly Trp Gly Lys Met Phe
195 200 205

Asn Phe Pro Arg Leu Asp Leu Leu Lys Asn Ile Phe Phe Lys Asp Glu
210 215 220

Thr Leu Ala Leu Ser Ser Asn Glu Leu Arg Asp Ile Leu Glu Arg Gln
225 230 235 240

Asn Asn Gln Ala Ile Thr Ile Phe Pro Glu Val Asn Ile Met Ser Leu
245 250 255

Glu Leu Ser Ile Ile Gln Arg Lys Leu His Gln Asp Phe Pro Phe Val
260 265 270

Ile Asn Phe Tyr Asn Leu Leu Tyr Pro Arg Phe Lys Asn Phe Thr Thr
275 280 285

Leu Met Ala Ala Phe Ser Ser Ile Lys Asn Ile Lys Arg Lys Lys Asn
290 295 300

Arg Asn Asn Ile Ile Lys Glu Ala Arg Tyr Leu Phe His Arg Glu Leu
305 310 315 320

Asp Lys Leu Val His Lys Ser Met Lys Met Glu Ser Ser Lys Val Ser
325 330 335

Asp Lys Thr Thr Pro Pro Met Ile Val Asp Asn Ser Tyr Leu Leu Thr
340 345 350

Lys Lys Glu Glu Ile Ser Ser Gly Lys Pro Lys Val Val Arg Ile Asn
355 360 365

Pro Tyr Ile Tyr Asp Val Thr Ile Ile Tyr Tyr Arg Val Lys Tyr Thr
370 375 380

Asp Ser Gly His Asp His Thr Asn Gly Asp Leu Arg Leu His Lys Gly
385 390 395 400

Tyr Gln Leu Glu Gln Ile Ser Pro Thr Ile Phe Glu Met Ile Gln Pro
405 410 415

Glu Met Glu Ser Glu Asn Asn Ile Lys Asp Lys Asp Pro Ile Val Val
420 425 430

Met Val Asn Val Lys Lys His Gln Ile Gln Pro Leu Leu Ala Tyr Asn
435 440 445

Asp Glu Ser Leu Glu Lys Trp Leu Glu Asn Arg Trp Ile Glu Lys Asp
450 455 460

Arg Leu Ile Glu Ser Leu Gln Lys Asn Ile Lys Ile Glu Thr Lys
465 470 475



<210> 220 
<211> 300 
<212> PRT 

<213> Saccharomyces sp.
<400> 220

Met Glu Lys Tyr Thr Asn Trp Arg Asp Asn Gly Thr Gly Ile Ala Pro
1 5 10 15

Phe Leu Pro Asn Thr Ile Arg Lys Pro Ser Lys Val Met Thr Ala Cys
20 25 30

Leu Leu Gly Ile Leu Gly Val Lys Thr Ile Ile Met Leu Pro Leu Ile
35 40 45

Met Leu Tyr Leu Leu Thr Gly Gln Asn Asn Leu Leu Gly Leu Ile Leu
50 55 60

Lys Phe Thr Phe Ser Trp Lys Glu Glu Ile Thr Val Gln Gly Ile Lys
65 70 75 80

Lys Arg Asp Val Arg Lys Ser Lys His Tyr Pro Gln Lys Gly Lys Leu
85 90 95

Tyr Ile Cys Asn Cys Thr Ser Pro Leu Asp Ala Phe Ser Val Val Leu
100 105 110

Leu Ala Gln Gly Pro Val Thr Leu Leu Val Pro Ser Asn Asp Ile Val
115 120 125

Tyr Lys Val Ser Ile Arg Glu Phe Ile Asn Phe Ile Leu Ala Gly Gly
130 135 140

Leu Asp Ile Lys Leu Tyr Gly His Glu Val Ala Glu Leu Ser Gln Leu
145 150 155 160

Gly Asn Thr Val Asn Phe Met Phe Ala Glu Gly Thr Ser Cys Asn Gly
165 170 175

Lys Ser Val Leu Pro Phe Ser Ile Thr Gly Lys Lys Leu Lys Glu Phe
180 185 190

Ile Asp Pro Ser Ile Thr Thr Met Asn Pro Ala Met Ala Lys Thr Lys
195 200 205

Lys Phe Glu Leu Gln Thr Ile Gln Ile Lys Thr Asn Lys Thr Ala Ile
210 215 220

Thr Thr Leu Pro Ile Ser Asn Met Glu Tyr Leu Ser Arg Phe Leu Asn
225 230 235 240

Lys Gly Ile Asn Val Lys Cys Lys Ile Asn Glu Pro Gln Val Leu Ser
245 250 255

Asp Asn Leu Glu Glu Leu Arg Val Ala Leu Asn Gly Gly Asp Lys Tyr
260 265 270

Lys Leu Val Ser Arg Lys Leu Asp Val Glu Ser Lys Arg Asn Phe Val
275 280 285

Lys Glu Tyr Ile Ser Asp Gln Arg Lys Lys Arg Lys
290 295 300

<210> 221 
<211> 759
<212> PRT

<213> Saccharomyces sp.
<400> 221

Met Pro Ala Pro Lys Leu Thr Glu Lys Phe Ala Ser Ser Lys Ser Thr
1 5 10 15

Gln Lys Thr Thr Asn Tyr Ser Ser Ile Glu Ala Lys Ser Val Lys Thr
20 25 30

Ser Ala Asp Gln Ala Tyr Ile Tyr Gln Glu Pro Ser Ala Thr Lys Lys
35 40 45

Ile Leu Tyr Ser Ile Ala Thr Trp Leu Leu Tyr Asn Ile Phe His Cys
50 55 60

Phe Phe Arg Glu Ile Arg Gly Arg Gly Ser Phe Lys Val Pro Gln Gln
65 70 75 80

Gly Pro Val Ile Phe Val Ala Ala Pro His Ala Asn Gln Phe Val Asp
85 90 95

Pro Val Ile Leu Met Gly Glu Val Lys Lys Ser Val Asn Arg Arg Val
100 105 110

Ser Phe Leu Ile Ala Glu Ser Ser Leu Lys Gln Pro Pro Ile Gly Phe
115 120 125

Leu Ala Ser Phe Phe Met Ala Ile Gly Val Val Arg Pro Gln Asp Asn
130 135 140

Leu Lys Pro Ala Glu Gly Thr Ile Arg Val Asp Pro Thr Asp Tyr Lys
145 150 155 160

Arg Val Ile Gly His Asp Thr His Phe Leu Thr Asp Cys Met Pro Lys
165 170 175

Gly Leu Ile Gly Leu Pro Lys Ser Met Gly Phe Gly Glu Ile Gln Ser
180 185 190

Ile Glu Ser Asp Thr Ser Leu Thr Leu Arg Lys Glu Phe Lys Met Ala
195 200 205

Lys Pro Glu Ile Lys Thr Ala Leu Leu Thr Gly Thr Thr Tyr Lys Tyr
210 215 220

Ala Ala Lys Val Asp Gln Ser Cys Val Tyr His Arg Val Phe Glu His
225 230 235 240

Leu Ala His Asn Asn Cys Ile Gly Ile Phe Pro Glu Gly Gly Ser His
245 250 255

Asp Arg Thr Asn Leu Leu Pro Leu Lys Ala Gly Val Ala Ile Met Ala
260 265 270

Leu Gly Cys Met Asp Lys His Pro Asp Val Asn Val Lys Ile Val Pro
275 280 285

Cys Gly Met Asn Tyr Phe His Pro His Lys Phe Arg Ser Arg Ala Val
290 295 300

Val Glu Phe Gly Asp Pro Ile Glu Ile Pro Lys Glu Leu Val Ala Lys
305 310 315 320

Tyr His Asn Pro Glu Thr Asn Arg Asp Ala Val Lys Glu Leu Leu Asp
325 330 335

Thr Ile Ser Lys Gly Leu Gln Ser Val Thr Val Thr Cys Ser Asp Tyr
340 345 350

Glu Thr Leu Met Val Val Gln Thr Ile Arg Arg Leu Tyr Met Thr Gln
355 360 365

Phe Ser Thr Lys Leu Pro Leu Pro Leu Ile Val Glu Met Asn Arg Arg
370 375 380

Met Val Lys Gly Tyr Glu Phe Tyr Arg Asn Asp Pro Lys Ile Ala Asp
385 390 395 400

Leu Thr Lys Asp Ile Met Ala Tyr Asn Ala Ala Leu Arg His Tyr Asn
405 410 415

Leu Pro Asp His Leu Val Glu Glu Ala Lys Val Asn Phe Ala Lys Asn
420 425 430
420 425 430 

... ... ... ... Phe Phe Arg Ser Ile Gly Leu Cys Ile Leu Phe Ser
435 440 445

Leu Ala Met Pro Gly Ile Ile Met Phe Ser Pro Val Phe Ile Leu Ala
450 455 460

Lys Arg Ile Ser Gln Glu Lys Ala Arg Thr Ala Leu Ser Lys Ser Thr
465 470 475 480

Val Lys Ile Lys Ala Asn Asp Val Ile Ala Thr Trp Lys Ile Leu Ile
485 490 495

Gly Met Gly Phe Ala Pro Leu Leu Tyr Ile Phe Trp Ser Val Leu Ile
500 505 510

Thr Tyr Tyr Leu Arg His Lys Pro Trp Asn Lys Ile Tyr Val Phe Ser
515 520 525

Gly Ser Tyr Ile Ser Cys Val Ile Val Thr Tyr Ser Ala Leu Ile Val
530 535 540

Gly Asp Ile Gly Met Asp Gly Phe Lys Ser Leu Arg Pro Leu Val Leu
545 550 555 560

Ser Leu Thr Ser Pro Lys Gly Leu Gln Lys Leu Gln Lys Asp Arg Arg
565 570 575

Asn Leu Ala Glu Arg Ile Ile Glu Val Val Asn Asn Phe Gly Ser Glu
580 585 590

Leu Phe Pro Asp Phe Asp Ser Ala Ala Leu Arg Glu Glu Phe Asp Val
595 600 605

Ile Asp Glu Glu Glu Glu Asp Arg Lys Thr Ser Glu Leu Asn Arg Arg
610 615 620

Lys Met Leu Arg Lys Gln Lys Ile Lys Arg Gln Glu Lys Asp Ser Ser
625 630 635 640

Ser Pro Ile Ile Ser Gln Arg Asp Asn His Asp Ala Tyr Glu His His
645 650 655

Asn Gln Asp Ser Asp Gly Val Ser Leu Val Asn Ser Asp Asn Ser Leu
660 665 670

Ser Asn Ile Pro Leu Phe Ser Ser Thr Phe His Arg Lys Ser Glu Ser
675 680 685

Ser Leu Ala Ser Thr Ser Val Ala Pro Ser Ser Ser Ser Glu Phe Glu
690 695 700

Val Glu Asn Glu Ile Leu Glu Glu Lys Asn Gly Leu Ala Ser Lys Ile
705 710 715 720

Ala Gln Ala Val Leu Asn Lys Arg Ile Gly Glu Asn Thr Ala Arg Glu
725 730 735

Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu
740 745 750

Glu Gly Lys Glu Gly Asp Ala
755



<210> 222

<211> 743

<212> PRT

<213> Saccharomyces sp.

<400> 222 

Met Ser Ala Pro Ala Ala Asp His Asn Ala Ala Lys Pro Ile Pro His
1 5 10 15

Val Pro Gln Ala Ser Arg Arg Tyr Lys Asn Ser Tyr Asn Gly Phe Val
20 25 30

Tyr Asn Ile His Thr Trp Leu Tyr Asp Val Ser Val Phe Leu Phe Asn
35 40 45

Ile Leu Phe Thr Ile Phe Phe Arg Glu Ile Lys Val Arg Gly Ala Tyr
50 55 60

Asn Val Pro Glu Val Gly Val Pro Thr Ile Leu Val Cys Ala Pro His
65 70 75 80

Ala Asn Gln Phe Ile Asp Pro Ala Leu Val Met Ser Gln Thr Arg Leu
85 90 95

Leu Lys Thr Ser Ala Gly Lys Ser Arg Ser Arg Met Pro Cys Phe Val
100 105 110

Thr Ala Glu Ser Ser Phe Lys Lys Arg Phe Ile Ser Phe Phe Gly His
115 120 125

Ala Met Gly Gly Ile Pro Val Pro Arg Ile Gln Asp Asn Leu Lys Pro
130 135 140

Val Asp Glu Asn Leu Glu Ile Tyr Ala Pro Asp Leu Lys Asn His Pro
145 150 155 160

Glu Ile Ile Lys Gly Arg Ser Lys Asn Pro Gln Thr Thr Pro Val Asn
165 170 175

Phe Thr Lys Arg Phe Ser Ala Lys Ser Leu Leu Gly Leu Pro Asp Tyr
180 185 190

Leu Ser Asn Ala Gln Ile Lys Glu Ile Pro Asp Asp Glu Thr Ile Ile
195 200 205

Leu Ser Ser Pro Phe Arg Thr Ser Lys Ser Lys Val Val Glu Leu Leu
210 215 220

Thr Asn Gly Thr Asn Phe Lys Tyr Ala Glu Lys Ile Asp Asn Thr Glu
225 230 235 240

Thr Phe Gln Ser Val Phe Asp His Leu His Thr Lys Gly Cys Val Gly
245 250 255

Ile Phe Pro Glu Gly Gly Ser His Asp Arg Pro Ser Leu Leu Pro Ile
260 265 270

Lys Ala Gly Val Ala Ile Met Ala Leu Gly Ala Val Ala Ala Asp Pro
275 280 285

Thr Met Lys Val Ala Val Val Pro Cys Gly Leu His Tyr Phe His Arg
290 295 300

Asn Lys Phe Arg Ser Arg Ala Val Leu Glu Tyr Gly Glu Pro Ile Val
305 310 315 320

Val Asp Gly Lys Tyr Gly Glu Met Tyr Lys Asp Ser Pro Arg Glu Thr
325 330 335

Val Ser Lys Leu Leu Lys Lys Ile Thr Asn Ser Leu Phe Ser Val Thr
340 345 350

Glu Asn Ala Pro Asp Tyr Asp Thr Leu Met Val Ile Gln Ala Ala Arg
355 360 365

Arg Leu Tyr Gln Pro Val Lys Val Arg Leu Pro Leu Pro Ala Ile Val
370 375 380

Glu Ile Asn Arg Arg Leu Leu Phe Gly Tyr Ser Lys Phe Lys Asp Asp
385 390 395 400

Pro Arg Ile Ile His Leu Lys Lys Leu Val Tyr Asp Tyr Asn Arg Lys
405 410 415

Leu Asp Ser Val Gly Leu Lys Asp His Gln Val Met Gln Leu Lys Thr
420 425 430

Thr Lys Leu Glu Ala Leu Arg Cys Phe Val Thr Leu Ile Val Arg Leu
435 440 445

Ile Lys Phe Ser Val Phe Ala Ile Leu Ser Leu Pro Gly Ser Ile Leu
450 455 460

Phe Thr Pro Ile Phe Ile Ile Cys Arg Val Tyr Ser Glu Lys Lys Ala
465 470 475 480

Lys Glu Gly Leu Lys Lys Ser Leu Val Lys Ile Lys Gly Thr Asp Leu
485 490 495

Leu Ala Thr Trp Lys Leu Ile Val Ala Leu Ile Leu Ala Pro Ile Leu
500 505 510

Tyr Val Thr Tyr Ser Ile Leu Leu Ile Ile Leu Ala Arg Lys Gln His
515 520 525

Tyr Cys Arg Ile Trp Val Pro Ser Asn Asn Ala Phe Ile Gln Phe Val
530 535 540

Tyr Phe Tyr Ala Leu Leu Val Phe Thr Thr Tyr Ser Ser Leu Lys Thr
545 550 555 560

Gly Glu Ile Gly Val Asp Leu Phe Lys Ser Leu Arg Pro Leu Phe Val
565 570 575

Ser Ile Val Tyr Pro Gly Lys Lys Ile Glu Glu Ile Gln Thr Thr Arg
580 585 590

Lys Asn Leu Ser Leu Glu Leu Thr Ala Val Cys Asn Asp Leu Gly Pro
595 600 605

Leu Val Phe Pro Asp Tyr Asp Lys Leu Ala Thr Glu Ile Phe Ser Lys
610 615 620

Arg Asp Gly Tyr Asp Val Ser Ser Asp Ala Glu Ser Ser Ile Ser Arg
625 630 635 640

Met Ser Val Gln Ser Arg Ser Arg Ser Ser Ser Ile His Ser Ile Gly
645 650 655

Ser Leu Ala Ser Asn Ala Leu Ser Arg Val Asn Ser Arg Gly Ser Leu
660 665 670

Thr Asp Ile Pro Ile Phe Ser Asp Ala Lys Gln Gly Gln Trp Lys Ser
675 680 685

Glu Gly Glu Thr Ser Glu Asp Glu Asp Glu Phe Asp Glu Lys Asn Pro
690 695 700

Ala Ile Val Gln Thr Ala Arg Ser Ser Asp Leu Asn Lys Glu Asn Ser
705 710 715 720

Ara Asn Thr Asn Ile Ser Ser Lys Ile Ala Ser Leu Val Arg Gln Lys
725 730 735 

Arg Glu His Glu Lys Lys Glu 
740 

<210> 223
<211> 397
<212> PRT

<213> Saccharomyces sp.
<400> 223

Met Leu His Gln Lys Ile Ala His Lys Val Arg Lys Val Val Val Pro
1 5 10 15

Gly Ile Ser Leu Leu Ile Phe Phe Gln Gly Cys Leu Ile Leu Leu Phe
20 25 30

Leu Gln Leu Thr Tyr Lys Thr Leu Tyr Cys Arg Asn Asp Ile Arg Lys
35 40 45

Gln Ile Gly Leu Asn Lys Thr Lys Arg Leu Phe Ile Val Leu Val Ser
50 55 60

Ser Ile Leu His Val Val Ala Pro Ser Ala Val Arg Ile Thr Thr Glu
65 70 75 80

Asn Ser Ser Val Pro Lys Gly Thr Phe Phe Leu Asp Leu Lys Lys Lys
85 90 95

Arg Ile Leu Ser His Leu Lys Ser Asn Ser Val Ala Ile Cys Asn His
100 105 110

Gln Ile Tyr Thr Asp Trp Ile Phe Leu Trp Trp Leu Ala Tyr Thr Ser
115 120 125

Asn Leu Gly Ala Asn Val Phe Ile Ile Leu Lys Lys Ser Leu Ala Ser
130 135 140

Ile Pro Ile Leu Gly Phe Gly Met Arg Asn Tyr Asn Phe Ile Phe Met
145 150 155 160

Ser Arg Lys Trp Ala Gln Asp Lys Ile Thr Leu Ser Asn Ser Leu Ala
165 170 175

Gly Leu Asp Ser Asn Ala Arg Gly Ala Gly Ser Leu Ala Gly Lys Ser
180 185 190

Pro Glu Arg Ile Thr Glu Glu Gly Glu Ser Ile Trp Asn Pro Glu Val
195 200 205

Ile Asp Pro Lys Gln Ile His Trp Pro Tyr Asn Leu Ile Leu Phe Pro
210 215 220

Glu Gly Thr Asn Leu Ser Ala Asp Thr Arg Gln Lys Ser Ala Lys Tyr
225 230 235 240

Ala Ala Lys Ile Gly Lys Lys Pro Phe Lys Asn Val Leu Leu Pro His
245 250 255

Ser Thr Gly Leu Arg Tyr Ser Leu Gln Lys Leu Lys Pro Ser Ile Glu
260 265 270

Ser Leu Tyr Asp Ile Thr Ile Gly Tyr Ser Gly Val Lys Gln Glu Glu
275 280 285

Tyr Gly Glu Leu Ile Tyr Gly Leu Lys Ser Ile Phe Leu Glu Gly Lys
290 295 300

Tyr Pro Lys Leu Val Asp Ile His Ile Arg Ala Phe Asp Val Lys Asp
305 310 315 320

Ile Pro Leu Glu Asp Glu Asn Glu Phe Ser Glu Trp Leu Tyr Lys Ile
325 330 335

Trp Ser Glu Lys Asp Ala Leu Met Glu Arg Tyr Tyr Ser Thr Gly Ser
340 345 350

Phe Val Ser Asp Pro Glu Thr Asn His Ser Val Thr Asp Ser Phe Lys
355 360 365

Ile Asn Arg Ile Glu Leu Thr Glu Val Leu Ile Leu Pro Thr Leu Thr
370 375 380

Ile Ile Trp Leu Val Tyr Lys Leu Tyr Cys Phe Ile Phe
385 390 395



<210> 224 
<211> 303 
<212> PRT

<213> Saccharomyces sp.

<400> 224

Met Ser Val Ile Gly Arg Phe Leu Tyr Tyr Leu Arg Ser Val Leu Val
1 5 10 15

Val Leu Ala Leu Ala Gly Cys Gly Phe Tyr Gly Val Ile Ala Ser Ile
20 25 30

Leu Cys Thr Leu Ile Gly Lys Gln His Leu Ala Gln Trp Ile Thr Ala
35 40 45

Arg Cys Phe Tyr His Val Met Lys Leu Met Leu Gly Leu Asp Val Lys
50 55 60

Val Val Gly Glu Glu Asn Leu Ala Lys Lys Pro Tyr Ile Met Ile Ala
65 70 75 80

Asn His Gln Ser Thr Leu Asp Ile Phe Met Leu Gly Arg Ile Phe Pro
85 90 95

Pro Gly Cys Thr Val Thr Ala Lys Lys Ser Leu Lys Tyr Val Pro Phe
100 105 110

Leu Gly Trp Phe Met Ala Leu Ser Gly Thr Tyr Phe Leu Asp Arg Ser
115 120 125

Lys Arg Gln Glu Ala Ile Asp Thr Leu Asn Lys Gly Leu Glu Asn Val
130 135 140

Lys Lys Asn Lys Arg Ala Leu Trp Val Phe Pro Glu Gly Thr Arg Ser
145 150 155 160

Tyr Thr Ser Glu Leu Thr Met Leu Pro Phe Lys Lys Gly Ala Phe His
165 170 175

Leu Ala Gln Gln Gly Lys Ile Pro Ile Val Pro Val Val Val Ser Asn
180 185 190

Thr Ser Thr Leu Val Ser Pro Lys Tyr Gly Val Phe Asn Arg Gly Cys
195 200 205

Met Ile Val Arg Ile Leu Lys Pro Ile Ser Thr Glu Asn Leu Thr Lys
210 215 220

Asp Lys Ile Gly Glu Phe Ala Glu Lys Val Arg Asp Gln Met Val Asp
225 230 235 240

Thr Leu Lys Glu Ile Gly Tyr Ser Pro Ala Ile Asn Asp Thr Thr Leu
245 250 255

Pro Pro Gln Ala Ile Glu Tyr Ala Ala Leu Gln His Asp Lys Lys Val
260 265 270

Asn Lys Lys Ile Lys Asn Glu Pro Val Pro Ser Val Ser Ile Ser Asn
275 280 285

Asp Val Asn Thr His Asn Glu Gly Ser Ser Val Lys Lys Met His
290 295 300



<210> 225
<211> 1146
<212> DNA

<213> Saccharomyces sp.

<400> 225

atgtctttta gggatgtcct agaaagagga gatgaatttt tagaagccta tcccagaaga 60
agcccccttt ggagatttct ttcatacagt acatcattac tgaccttcgg tgtatcaaaa 120
ctgcttcttt tcacatgcta taatgtcaaa ttgaatggtt ttgaaaaatt agaaactgcc 180
ttggaacgtt ccaaaaggga aaatagaggc cttatgacgg tcatgaacca tatgagtatg 240
gtcgatgatc cgttagtttg ggcaacacta ccatataagt tatttacgtc tttggacaac 300
ataagatggt ctttgggtgc acataatatt tgctttcaaa ataaatttct ggccaacttt 360
ttctcacttg gccaagtcct ttcaacagaa agatttgggg tgggcccatt tcaaggttct 420
acagatgctt caataagatt gttaagccct gacgacactt tagacttgga atggacccct 480
cactctgagg tctcttcttc gctaaaaaaa gcctactccc cgcccataat aaggtcgaag 540
ccatcttggg tccatgttta tccagaagga tttgtactac aattatatcc gccttttgaa 600
aattcgatga ggtattttaa atggggtatt accagaatga tcctagaagc aacaaagccg 660
cccattgtag taccaatatt tgctacaggg tttgaaaaaa tagcatccga agcagtcaca 720
gattcaatgt ttagacaaat tctaccaaga aactttggct ctgaaataaa tgttaccata 780
ggggatcctt taaatgatga tttaatcgac aggtatagaa aagaatggac acatttggtt 840
gaaaaatact atgatcccaa aaatcctaac gacctctctg acgaattgaa atatggtaaa 900
gaggcgcaag atttaagaag cagattagcc gctgaactga gagcccatgt tgctgaaatt 960
agaaatgaag ttcgcaaatt accacgcgaa gaccctaggt tcaaatcccc ctcatggtgg 1020
aagcggttca acaccacgga aggtaaatcg gacccagatg ttaaagtcat tggcgaaaat 1080
tgggcaataa ggaggatgca aaagtctctg cctccagagg gtaaaccaaa gggtaaggat 1140
gattga 1146


<210> 226
<211> 1191
<212> DNA

<213> Saccharomyces sp.

<400> 226

atgaagcatt cccaaaaata ccgtaggtat ggaatttatg aaaagactgg taatcccttt 60
ataaaagggt tgcaaaggct gcttatcgct tgcttgttca tttcaggctc gctgagtatt 120
gtcgtttttc agatctgtct acaggtgctt ctcccttgga gcaagattag atttcaaaat 180
ggtataaatc aaagtaagaa ggcttttatc gttttattat gcatgatctt gaacatggtg 240
gctccctctt ctttgaatgt cacttttgaa acatcgcggc cattgaagaa ctcttctaac 300
gccaagccat gctttagatt taaagacagg gctataataa ttgcaaatca tcaaatgtat 360
gcagactgga tttatctctg gtggctttcc tttgtttcaa atttgggtgg taacgtttat 420
atcatcctga agaaagctct gcagtacata ccattactgg gatttggcat gcgaaatttt 480
aagtttatat ttttaagtag gaactggcaa aaggatgaga aagctttaac aaatagtttg 540
gtttctatgg acttaaacgc gaggtgcaag gggcccctta caaattataa gagttgttat 600
tccaagacaa atgaatccat tgccgcttat aatttaatca tgttccctga gggtacaaat 660
ctaagcctca agacaagaga aaaaagcgag gcattctgtc aaagagcaca tttggaccat 720
gtccaattaa gacatttgtt attaccgcac tctaaaggct tgaagtttgc agtagaaaaa 780
ctagctccta gtttagatgc tatctacgat gtcactattg gatattctcc cgccttgaga 840
acggaatacg tcggcaccaa attcaccttg aagaaaatat tcttaatggg tgtctatccg 900
gagaaagtag atttttatat tagggaattt agagttaatg agatcccttt gcaagatgac 960
gaagtttttt tcaattggtt actgggcgtg tggaaagaaa aagatcaact gctagaagac 1020
tactacaaca caggccaatt taaaagtaat gctaaaaatg acaaccaatc catcgttgtt 1080
acgacacaaa cgactggatt tcagcacgaa acattgacac cccgtatcct ttcatattac 1140
gggttcttcg cttttcttat tcttgtattt gtgatgaaaa aaaatcattg a 1191
<210> 227
<211> 1440
<212> DNA

<213> Saccharomyces sp.
<400> 227

a*_gggrt"cg c t gar. t tctt cgaaacata: acggucgg:" ctagggtcca gt:caaacag 60 

tcagacattt ctgatitggtt gagr.ccgacc ccaagg::gc tra::ct::t :ggctat :tt 120 

taccttcatt ctttttctac :gcaatcaa: caaticc:ac agttcattaa cacgaartcc 180 

f.crgtCuta gactgcattt actatatgac aga:tt"ggt: cgca:gtgcc ca:aa:aggt 240

gagtacaaaa ctcggctgct c:cgagggca c:gaca;a;a gtaaactgaa aataatacca 300 

ac tctagaca aggtgctgga ggcgatcgaa at:tgg""tc agctacat:r agntgaaatg 360 

accttcgaaa aaaaaaaaaa cg:ccaaatt ctcacaaccg agggaagtga tgacctaaac 420 

rtt:ttaaag aiagcaaatt ccaaaccaca t:aatgata: graatcatcg accagcgaat 480 

gac:a:ac:a- tgancaatta ccctnttctc aaaagttg:c ccaccaagtt tta:actaaa 540 

tgggaattrc tacaaaagct gaggaagggg gaagatccag ccgaatggcc tcagtnaaaa 600 

tttcttggtt ggggaaaaa: gtttaacttt ccccga^ngg anctactaaa gaacatattc 660 

ttcaaagacg aaaca: tcgc actctcatcg aa cgag t c aa gagacatttt agaaagacaa 720 

aacaaccaag ctattactat ttttcccgaa gtcaatatca tgagnttgga actatcaatt 780 

ar :caaagaa aatta ;acca agattttccc cttgttataa acttctataa cctattatac 840 

ccaaga:f.ta aaaacttcac cactttgatg gctgcttttc carcaattaa aaacatcaaa 900 

agaaagaaaa accg:aacaa ;a:aatcaaa gaggcccgat acctgtttca cagagaactt 960 

gacaaattag ttcacaagag cacgaaaatg gagtcttc:a aggratccga taagacgacg 1020

ccgcceatga tcgtagataa ttcatactta cttacaaaaa aggaagaaat cagcagcggc 1080

aagcccaagg tggtacgaat caatccatac atatatgatg tcaccataat ttattaccga 1140

gtcaaatata ctga^agtgg gcatgatcat accaacggag atttgagact tcataaaggt 1200

tatcaattag agcaaatatc tccgacaatc tttgagatga rtcaaccaga aatggagtct 1260

gaaaacaaca taaaggataa ggaccccatt gttgtgatgg taaatgtaaa aaagcatcaa 1320

at.-.caaccat tactcgcata caatgatgag agtttagaaa agrggcttga aaataggtgg 1380

atagaaaaag acagattaat cgagtccti-g caaaaaaata ttaaaatrga gaccaaataa 1440

<210> 228
<211> 903
<212> DNA

<213> Saccharomyces sp.
<400> 228

atggaaaagt acaccaattg gagagacaat ggtacgggaa tagctccatt tctaccaaac 60
acaatcagga aacctagtaa ggtgatgaca gcgtgtttgt tgggtatcct aggggtgaaa 120
accattataa tgctaccatt gattatgctg taccttctaa ctggccagaa caacttactg 180
gg^ttgatat tgaagtttac attcagttgg aaagaggaaa ttaccgtgca aggaatcaag 240
aaacgtgacg taaggaaatc caagcattat ccacagaagg gcaagcttta tatttgcaat 300
tgtacctcac ctttagatgc tttttcagtg gtgttattag ctcaagggcc tgttacgttg 360
ttggtcccat ccaatgacat tgtatacaaa gtttccataa gagaattcac caacttcatc 420
ctcgccggtg ggttagatat aaaactctat ggccacgagg tagcagagct atctcaattg 480
ggcaataccg tgaattttat gtttgctgag ggtacctcat gtaatggtaa aagcgtctta 540
ccgtntagca taaccgggaa aaaacttaaa gaattcatag acccctcaat aaccacaatg 600
aaccccgcaa tggccaaaac naaaaaattt gaattgcaga ccatccaaat caaaactaat 660
aaaactgcca ccaccacatt gcccatctcc aatatggagt acttatctag atttctgaac 720
aagggcatta atgttaaatg caagatcaac gagccacaag tactctcgga taatttagag 780
gaattacgcg ttgcatraaa cggtggcgac aaanataaac tagtctcacg gaagttagat 840
gttgaatcta agaggaattt tgtgaaggaa tatatcagcg atcaacgtaa aaagaggaag 900
tag 903

<210> 229
<211> 2280
<212> DNA

<213> Saccharomyces sp.
<400> 229

atgcctgcac caaaactcac ggagaaattt gcctcttcca agagcacaca gaaaactacg 60 
aactacagtt ccatcgaggc caaaagcgtc aagacgtcgg ctgaccaggc atacatctac 120 
caagagccta gcgctaccaa gaagatactt tactccatcg ccacatggct gttgtacaac 180
accttccact gcttctttag agaaatcaga ggccggggca gtttcaaggt accgcaacag 240 
ggaccggtga tctttgttgc ggctccgcat gctaaccagt tcgtcgaccc tgtaatcctt 300 
atgggcgagg tgaagaaatc tgtcaacaga cgtgtgtcct tcctgattgc ggagagctca 360 
ttaaagcaac cccccatagg gtttttggct agtttcttca tggccatagg cgtggtaagg 420 
ccgcaggata atttgaaacc ggcagaaggt actatccgcg tagatccaac agactacaag 480 
agagttatcg gccacgacac gcatttcttg actgattgta tgccaaaggg tctcatcggg 540 
ttacccaaat caatgggatt tggagaaatc cagtccatag aaagtgacac gagtttgacc 600 
ctaagaaaag agttcaaaat ggccaaacca gagattaaaa ctgctttact caccggcact 660 
acttataaat atgccgctaa agtcgaccaa tcttgcgttt accatagagt ttttgagcat 720 
ttggcccata acaactgcat tgggatcttt cctgaaggtg ggtcccacga cagaacaaac 780 
ttgttgcccc tgaaagcagg tgtggcgatt atggctcttg gttgcatgga taagcatcct 840 
gacgtcaatg ttaagattgt tccctgcggt atgaattatt tccatccaca taagttcagg 900 
tcgagagcgg ttgttgaatt cggtgacccc attgaaatac cgaaggaact agtcgccaag 960 
taccacaacc cggaaacgaa cagagatgca gtgaaagaat tattagatac catatcgaag 
1020 

ggtttacaat ccgttaccgt tacatgttct gattatgaaa ctttgatggt ggttcaaacg 
1080 

ataagaagac tatatatgac acaatttagc accaagttac cgttgccctt gattgtggaa 
1140 

atgaacagaa gaatggtcaa aggttacgaa ttctatagaa acgatcctaa aatagcggac 
1200 

tcgaccaaag atataatggc atataatgcc gccttgagac actataatct tcctgatcac 
1260 

cttgtggagg aggcaaaggt aaatttcgca aaaaacctcg gacttgtttt ttttagatcc 
1320 

arcgggctct gcatcctctt ttcgttagcc atgccaggta tcattatgtt ctcacctgtc 
1380 

ttcatattag ccaagagaat ttctcaagaa aaggcccgta ccgctttgtc caagtctaca 
1440 

gttaaaataa aggctaacga tgtcattgcc acgtggaaaa tcttgattgg gatgggattt 
1500 

gcgcccttgc tttacatctt ttggtccgtt ttaatcactt attacctcag acataaacca 
1560 

tggaataaaa tatatgtttt ttccgggtct tacatctcgt gtgttatagt cacgtattcc 
1620 

gccttaatcg tgggtgatat tggtatggat ggtttcaaat ctttgagacc actggtttta 
1680 

tctcttacat ctccaaaggg cttgcaaaag ctacaaaagg atcgtagaaa tctggcagaa 
1740 

agaataatcg aagttgtaaa taactttgga agcgaattat tccccgattt cgatagtgcc 
1800 

gccctacgtg aagaattcga cgtcatcgat gaagaggaag aagatcgaaa aacctcagaa 
1860 

ttgaatcgca ggaaaatgct aagaaaacag aaaataaaaa gacaagaaaa agattcgtca 
1920 

tcacctatca tcagccaacg tgacaaccac gatgcctatg aacaccataa ccaagattcc 
1980 

gatggcgtct cattggtcaa tagtgacaat tccctctcta acattccatt attctcttct 
2040 

acttttcatc gtaagtcaga gtcttcctta gcttcgacat ccgttgcacc ttcttcttcc 
2100 

tccgaatttg aggtagaaaa cgaaatcttg gaggaaaaaa atggattagc aagtaaaatc 
2160 

gcacaggccg tcttaaacaa gagaattggt gaaaatactg ccagggaaga ggaagaggaa 
2220 

gaagaagagg aagaagaaga agaggaagaa gaagaagaag ggaaagaagg agatgcgtag 
2280 

<210> 230 
<211> 2232 
<212> DNA 

<213> Saccharomyces sp.
<400> 230 

atgtctgctc ccgctgccga tcataacgct gccaaaccta ttcctcatgt acctcaagcg 60 
tcccgacggt acaaaaattc atacaatgga ttcgtataca atatacatac atggctgtat 120 
gatgtgtctg tatttctgtt taatattttg ttcactattt tcttcagaga aattaaggta 180 
cgtggtgcat ataacgttcc cgaagttggg gtgccaacca tccttgtgtg tgcccctcat 240 
gcaaatcagt tcatcgaccc ggctttggta atgtcgcaaa cccgtttgct gaagacatca 300 
gcgggaaagt cc:ga:ccag aaigcct-g: t:tgttac:g ctgagtcgag tt^:aagaaa 360
agarrr.ar.ct crttcr.rt.gg tcacgcaatg ggcggtarrc ccgtgcctag aarrcaggac 420
aacrrgaagc cagtggarga gaatcrtgag arrtacgctc cggacrtgaa gaaccacccg 480
gaaarcatca agggccgctc caagaaccca cagactacac cagrgaacrr tacgaaaagg 540
rrttctgcca agtccrtgct r ggar rgccc gactac r r aa gtaacgctca aatcaaggaa 600
atcccggarg a t gaaacgar aa tcttgtce tctccattca gaacarcgaa atcaaaagtg 660
grggagctcr tgactaatgg tactaatttt aaatargcag agaaaarcga caatacggaa 720
actttccaga gtgtttttga tcacttgcat acgaagggct gtgraggtar tttccccgag 780
ggrggttc tc atgaccgtcc ctcgttacta cccatcaagg caggtgr t gc cattatggct 840 
ctgggcgcag tagccgctga tcctaccatg aaagttgctg ttgtaccctg tggcttgcat 900 
tatttccaca gaaaraaatr cagatctaga gctgttttag aatacggcga acctatagtg 960 
gtggatggga aaratggcga aargtataag gactccccac gtgagaccgt ttccaaacta 
1020

craaaaaaga tcaccaattc tttgttctct gttaccgaaa argcrccaga ttacgatact 
1080 

ttgatggtca ttcaggcrgc cagaagacta ratcaaccgg raaaagtcag gcracctttg 
1140 

cctgccattc tagaaatcaa cagaaggcta crtttcggtr artccaagrr raaagatgar 
1200 

ccaagaatta ttcacttaaa aaaacrggra catgactaca acaggaaart agartcagtg 
1260 

ggr r raaaag acratcaggt gatgcaarra aaaacta~ca aarragaagc attgaggtgc 
1320 

tttgtaactt rgarcgttcg attgattaaa ttttcrgtct ttgctatact atcgttaccg 
1380

ggtrctartc tcrtcactcc aattttcatr atttgtcgcg tatactcaga aaagaaggcc 
1440

aaagagggtt taaagaaatc attggttaaa attaagggra ccgatttgtt ggccacatgg 
1500 

aaacttatcg rggcgttaat attggcacca attttatacg tracttacrc gatcttgttg 
1560 

ar tar ttrgg caagaaaaca acacrattgt cgcatctggg ttccttccaa taacgcarrc 
1620 

aracaatttg rctatrtrta tgrgttattg gttttcacca cgrarrcctc rrtaaagacc 
1680 

ggrgaaatcg gtgrtgacct ttrcaaarct ttaagaccac trrrrgrttc tattgtttac 
1740 

cccggtaaga agatcgaaga aatccaaaca acaagaaaga atttaagtct agagttgact 
1800 

gctgtttgta acgatttagg acctttggtt ttccctgatt acgataaatt agcgactgag 
1860 

atattctcra agagagacgg ttatgatgtc tcttctgatg cagagtcttc tataagtcgt 
1920 

atgagtgtac aatctagaag ccgctcttct tctatacatt ctattggctc gctagcttct 
1980 

aacgccctat caagagtgaa ttcaagaggc tcgttgaccg atattccaat tttttctgat 
2040

gcaaagcaag gtcaatggaa aagtgaaggt gaaactagtg aggatgagga tgaatttgar 
2100 

gagaaaaatc ctgccatagt acaaaccgca cgaagttctg atctaaataa ggaaaacagt 
2160 

cgcaacacaa atatatcttc gaagattgcr tcgctggtaa gacagaaaag agaacacgaa 
2220 

aagaaagaat ga 
2232 

<210> 231 
<211> 1194 
<212> DNA 

<213> Saccharomyces sp.
<400> 231 

atgctgcatc aaaaaatagc tcataaagtt cgaaaagtcg tcgtcccagg tatttcctta 60
rtgatttrcr rccagggatg ccttattcrt ttgtttctcc aactcaccta taagactctt 120
tactgtagaa atgatataag gaaacaaatt ggtctcaata aaaccaaaag attatttatt 180
gtcttggtat catccatrtt gcargrtgrc gcaccatctg cagrgagaat taccactgaa 240
aattccagtg ttcctaaagg tacttttttt rragacttga agaagaaaag gattctttct 300
catctaaagt ccaattcggt ggccatttgc aatcaccaaa tatacacgga ttggatattt 360
ttatggtggt tggcttacac atcgaactta ggggctaatg tcttcattat tttaaaaaaa 420 
tcgttggctt ccattcctat cctcggtttc ggtatgagaa actataattt catttttatg 480 






agtagaaagt gggcacaaga caaaataacc ctaagcaaca gccttgctgg ccttgattcg 540
aatgcaaggg gcgccggctc acttgcngga aagtcacctg agcgcataac tgaggaagga 600
gagagcatat ggaatccgga ggttattgat ccaaaacaaa tccattggcc atacaatctt 660
atcctattcc ctgaaggtac aaatctcag: gctgatacta ggcaaaaaag tgctaaatat 720
gctgccaaaa taggcaaaaa gccartcaag aatgtgctac tgcctcatcc tacaggccta 780
agatactcgn tacaaaagtt gaagccaagt attgaaagtc tttatgatat tacgatcggc 840
tactccggtg taaaacagga ggaatatggt gagcttatat atgggctgaa gagcatattt 900
ttagaaggaa aatacccgaa gttagtcgat attcacatca gagcatttga tgttaaagat 960
attccattag aggacgagaa tgaattttca gaatggctgt ataaaatttg gagtgagaag 1020
gatgctctaa tggaaaggta ctattccact ggatcattcg taagtgatcc tgaaacaaac 1080
cactcagtta ... gctaatatta 1140
ccaactctaa ... ttga 1194





<210> 232
<211> 912
<212> DNA

<213> Saccharomyces sp.

<400> 232

atgagtgtga taggtaggtt cttgtattac ttgaggtccg tgttggtcgt actggcgctt 60
gcaggctgtg gcttttacgg tgtaatcgcc tctatccttt gcacgttaat cggtaagcaa 120
catttggctc agtggattac tgcgcgttgt ttttaccatg tcatgaaatt gatgcttggc 180
ctcgacgtca aggtcgttgg cgaggagaat ttggccaaga agccatatat tatgattgcc 240
aatcaccaat ccaccttgga tatcttcatg ttaggtagga ttttcccccc tggttgcaca 300
gctactgcca agaagtcttt gaaatacgtc ccctttctgg gttggttcat ggctttgagt 360
ggtacatatt tcttagacag atctaaaagg caagaagcca ttgacacctt gaataaaggt 420
tcagaaaatg ttaagaaaaa caagcgtgct ctatgggttt ttcctgaggg taccaggtct 480
tacacgagtg agctgacaat gttgcctttc aagaagggtg ctttccattt ggcacaacag 540
ggcaagatcc ccattgttcc agtggttgtt tccaatacca gtactttagt aagtcctaaa 600
tatggggtct tcaacagagg ctgtatgatt gttagaattt taaaacctat ttcaaccgag 660
aacttaacaa aggacaaaat tggtgaattt gctgaaaaag ttagagatca aatggttgac 720
actttgaagg agattggcta ctctcccgcc atcaacgata caaccctccc accacaagct 780
atcgagtatg ccgctcttca acatgacaag aaagtgaaca agaaaatcaa gaatgagcct 840
gtgccttctg tcagcattag caacgatgtc aatacccata acgaaggttc atctgtaaaa 900
aagatgcatt aa 912



<210> 233 
<211> 54 
<212> DNA 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 233

cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaa 54



<210> 234 
<211> 32 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 234

tcgaggatcc gcggccgcaa gcttcctgca gg 32 



<210> 235
<211> 32 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 235 

tcgacctgca ggaagcttgc ggccgcggat cc 32 

<210> 236
<211> 32
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 236

tcgacctgca ggaagcttgc ggccgcggat cc 32

<210> 237 
<211> 32
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 237

tcgaggatcc gcggccgcaa gcttcctgca gg 32 

<210> 238
<211> 36
<212> DNA

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 238

tcgaggatcc gcggccgcaa gcttcctgca ggagct 36 

<210> 239
<211> 28
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 239

cctgcaggaa gcttgcggcc gcggatcc 28 

<210> 240
<211> 36
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 240

tcgacctgca ggaagcttgc ggccgcggat ccagct 36

<210> 241
<211> 28
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 241

ggatccgcgg ccgcaagctt cctgcagg 28 



